NOVEL NUCLEIC ACIDS AND POLYPEPTIDES


1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.


2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960. The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic
acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and
N is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.
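
The percent-identity thresholds mentioned above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned, with gaps written as "-", and it counts identity over alignment columns, which is only one of several conventions in use. The function name and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption: the inputs are pre-aligned (equal length, '-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be pre-aligned, non-empty and of equal length")
    identical = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap never counts as an identity
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical ten-column alignment with one substitution and one gap.
print(percent_identity("MKTAYIAKQR", "MKTAYLAK-R"))
```

In this hypothetical alignment eight of ten columns are identical, so the pair would fall at the 80% level in the list of thresholds above.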

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

WO 01/57190 PCT/US01/04098

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of applications,
as described herein, including use in arrays for detection.




4. DETAILED DESCRIPTION OF THE INVENTION 


4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a", 
"an" and "the" include plural references unless the context clearly dictates otherwise. 

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
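
The base-pairing rule in this definition can be sketched in a few lines. This is an illustrative example only: it assumes DNA sequences written with the four letters A, C, G and T, and the function names are hypothetical.

```python
# Watson-Crick pairing for DNA (assumption: only A, C, G and T occur).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq_5to3: str) -> str:
    """Complementary strand written in the opposite (3'->5') orientation."""
    return "".join(PAIRS[base] for base in seq_5to3.upper())


def reverse_complement(seq_5to3: str) -> str:
    """Complementary strand rewritten in the conventional 5'->3' orientation."""
    return complement(seq_5to3)[::-1]


def fraction_paired(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of positions that base-pair: 1.0 is 'complete', less is 'partial'."""
    if len(strand_5to3) != len(other_3to5) or not strand_5to3:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS.get(x) == y
        for x, y in zip(strand_5to3.upper(), other_3to5.upper())
    )
    return paired / len(strand_5to3)


print(complement("AGT"))              # the 3'-TCA-5' partner from the example above
print(fraction_paired("AGT", "TCA"))  # complete complementarity
print(fraction_paired("AGT", "TCG"))  # partial complementarity
```

The last two calls contrast the "complete" and "partial" cases described in the definition; the second strand in the final call is a made-up example with one non-pairing position.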

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs consists of nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is 1 in 300. In the human genome, there are three
billion base pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies,
fifteen-mer segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.
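
The counting argument above can be checked with a short calculation. The sketch below is illustrative only: the constants are the round figures quoted in the text (three billion base pairs, expressed sequence taken as 5% of the genome), the function name is hypothetical, and bases are treated as independent and equally likely, which is the simplification the text itself makes.

```python
GENOME_BP = 3_000_000_000             # base pairs in one set of human chromosomes (from the text)
EXPRESSED_BP = GENOME_BP * 5 // 100   # assumption: expressed sequence taken as 5% of the genome


def kmers_per_position(k: int, target_bp: int) -> float:
    """Ratio of possible k-mers (4**k) to the number of positions in the target.

    A ratio of N corresponds to the text's "1 in N" chance of a full match.
    """
    return 4 ** k / target_bp


print("20-mer vs genome:   ", kmers_per_position(20, GENOME_BP))
print("17-mer vs genome:   ", kmers_per_position(17, GENOME_BP))
print("15-mer vs expressed:", kmers_per_position(15, EXPRESSED_BP))
```

Under these assumptions the ratios come out near 370, 5.7 and 7.2, which is the same order as the rounded "1 in 300" and "one in five" figures quoted above.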

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
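
The single-mismatch estimate can be worked through in the same way. This is a sketch under the same simplifying assumptions as the text (independent, equally likely bases; three billion base pairs per genome; expressed sequence taken as 5% of the genome). The function name is hypothetical.

```python
GENOME_BP = 3_000_000_000             # base pairs in one set of human chromosomes (from the text)
EXPRESSED_BP = GENOME_BP * 5 // 100   # assumption: expressed sequence taken as 5% of the genome


def one_in_n_single_mismatch(k: int, target_bp: int) -> float:
    """Return N such that a k-mer with exactly one mismatch is expected "1 in N" times.

    Per-position probability = full-match probability (1 / 4**k) times the number
    of single-mismatch variants (3 alternative bases at each of k positions).
    """
    p_per_position = (1 / 4 ** k) * (3 * k)
    return 1 / (p_per_position * target_bp)


print("20-mer vs genome:   ", one_in_n_single_mismatch(20, GENOME_BP))
print("18-mer vs expressed:", one_in_n_single_mismatch(18, EXPRESSED_BP))
print("25-mer vs genome:   ", one_in_n_single_mismatch(25, GENOME_BP))
```

With these inputs the twenty-mer and eighteen-mer cases come out at roughly 6 and 8.5, the same order as the "approximately one in five" quoted above, while a twenty-five mer with a single mismatch is expected only about once in 5000 genomes.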

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
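
As an illustration of this definition, the sketch below scans one reading frame of a DNA string and returns the longest run of triplets that contains no termination codon. It assumes the standard genetic code (stop codons TAA, TAG and TGA) and, following the definition as worded, does not require a start codon. The function name and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def longest_orf(seq: str, frame: int = 0) -> str:
    """Longest run of consecutive triplets in `frame` (0, 1 or 2) with no stop codon."""
    seq = seq.upper()
    best: list[str] = []
    current: list[str] = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            # A termination codon closes the current run of triplets.
            if len(current) > len(best):
                best = current
            current = []
        else:
            current.append(codon)
    if len(current) > len(best):
        best = current
    return "".join(best)


# Hypothetical sequence: ATG AAA TAG CCC GGG TTT AAA
print(longest_orf("ATGAAATAGCCCGGGTTTAAA"))
```

A fuller search would repeat the scan over all three frames of both strands; the single-frame version is kept here for brevity.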

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more preferably less
than 200 amino acids, more preferably less than 150 amino acids and most preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any polypeptide must have sufficient length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein, which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
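
The grouping of residues given above can be written down as a small lookup, shown here as a hedged sketch. The four classes and their members are those listed in the text, expressed in standard one-letter amino acid codes; the function names are hypothetical, and treating "same class" as the test for a conservative substitution is a simplification of the several similarity criteria the text mentions.

```python
# Residue classes as listed in the text, in one-letter amino acid codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Name of the class that contains the given one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not one of the 20 listed amino acids: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same class (simplified criterion)."""
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"))  # leucine -> isoleucine, both nonpolar
print(is_conservative("D", "K"))  # aspartic acid -> lysine, acidic vs basic
```

A table of this kind only proposes candidate substitutions; as the paragraph above notes, the variation actually tolerated is determined experimentally by assaying the resulting variants for activity.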

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or 
polypeptides present in their natural source. 

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" 
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can


WO 01/57190 PCT/US01/04098

comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3) 
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked 
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells 
can be prokaryotic or eukaryotic. 

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
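The oligonucleotide wash temperatures above amount to a small table keyed by oligonucleotide length. A minimal sketch follows; the names `OLIGO_WASH_TEMP_C` and `oligo_wash_temp_c` are hypothetical, and the function deliberately returns nothing for lengths the text does not list rather than interpolating.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, exactly as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def oligo_wash_temp_c(length):
    """Return the listed wash temperature, or None if the length is not listed."""
    return OLIGO_WASH_TEMP_C.get(length)
```

Lengths other than 14, 17, 20, and 23 bases are left undefined here because the text gives no rule for them.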

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 85% sequence identity, more preferably at least
90% sequence identity, more preferably at least 95% sequence identity, more preferably at least
98% sequence identity and most preferably at least 98% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% identity, more preferably at least about 85% identity, more
preferably at least about 90% identity, more preferably at least about 95% identity, more
preferably at least 98% and most preferably at least about 99% identity. For the purposes of the
present invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between
sequences can also be determined by other methods known in the art, e.g., by varying
hybridization conditions.
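The ratio described above (substitutions, additions, and deletions divided by the number of residues in the substantially equivalent sequence) can be illustrated with a short calculation. This is a sketch under stated assumptions, not the Jotun Hein method: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and the names `variation_fraction`, `percent_identity`, and `within_variation_limit` are hypothetical.

```python
GAP = "-"  # assumed gap symbol in the pre-aligned input


def variation_fraction(ref_aligned, subj_aligned):
    """Differences divided by the number of residues in the subject sequence."""
    if len(ref_aligned) != len(subj_aligned):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    subject_residues = 0
    for r, s in zip(ref_aligned.upper(), subj_aligned.upper()):
        if r == GAP and s == GAP:
            continue  # a column that is a gap in both carries no information
        if s != GAP:
            subject_residues += 1
        if r != s:
            differences += 1  # substitution, addition, or deletion
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return differences / subject_residues


def percent_identity(ref_aligned, subj_aligned):
    """Percent identity as the text uses it: 100% minus the percent variation."""
    return 100.0 * (1.0 - variation_fraction(ref_aligned, subj_aligned))


def within_variation_limit(ref_aligned, subj_aligned, limit=0.35):
    """True if the variation is at or below the limit (0.35 per the text)."""
    return variation_fraction(ref_aligned, subj_aligned) <= limit
```

One substitution in a ten-residue sequence gives a variation of 0.1, i.e. 90% identity. The sketch covers only the arithmetic; the functional criteria in the definition (no adverse functional dissimilarity) are outside its scope.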

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 


4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide sequence encoding the
mature protein coding sequence of the polypeptides of any one of SEQ ID NO: 985-1968,
2953-3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent conditions to (a) the
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in receptor-like
polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic domains, or
combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially 
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be obtained by screening appropriate cDNA or genomic DNA libraries under suitable
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for suitable primer(s)
that allow identification and/or amplification of genes in appropriate genomic DNA or cDNA
libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length 
gene. 

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even more 
typically at least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited 
above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention from other 
polynucleotide sequences in the same family of genes or can differentiate human genes from 
genes of other species, and are preferably based on unique nucleotide sequences. 

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 with a sequence from another isolate of the same species. Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for the same amino acid 
sequences as do the specific ORFs disclosed herein. In other words, in the coding region of an 
ORF, substitution of one codon for another codon that encodes the same amino acid is expressly 
contemplated. 
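The codon substitution described above can be checked mechanically: two open reading frames that differ only by synonymous codons translate to the same amino acid sequence. The following sketch assumes the standard genetic code and in-frame DNA input; the names `CODON_TABLE`, `translate`, and `encodes_same_protein` are hypothetical and do not come from the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encodes_same_protein(orf_a, orf_b):
    """True if both ORFs encode the same amino acid sequence."""
    return translate(orf_a) == translate(orf_b)
```

For instance, `ATGCTGGCT` and `ATGTTAGCC` differ at two codons yet both encode Met-Leu-Ala, so the second is a synonymous variant of the first in the sense described above.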

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for Basic Local
Alignment Search Tool, is used to search for local sequence alignments (Altschul, S.F., J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a
FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 
The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the 
polynucleotides. 

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a 
native or variant polynucleotide. There are two variables in the construction of amino acid 
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species 
(variable positions) or in highly conserved regions (constant regions). Sites at such locations 
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to 
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid 
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal 
sequences necessary for secretion or for intracellular targeting in different host cells and 
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein. 

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
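The idea of a mutagenic oligonucleotide, i.e. the changed codon flanked on both sides by unaltered template sequence, can be sketched as a string operation. This is an illustration only and not a primer-design procedure from the cited references: the function name `mutagenic_oligo` is hypothetical, the template is assumed to be a coding strand in frame from its first base, and the default flank of 12 nucleotides is an arbitrary placeholder, since the text requires only "sufficient" adjacent nucleotides for a stable duplex.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Return the new codon flanked by `flank` unaltered template bases per side.

    `codon_index` is zero-based and counts codons from the start of `template`.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly 3 nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence on both sides")
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

With the template `AAACCCGGGTTTAAACCC`, replacing the third codon (`GGG`) by `GCT` with a flank of 3 gives `CCCGCTTTT`. In practice, flank length and duplex stability depend on sequence composition and reaction conditions, which this sketch does not model.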
A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-2952, 3937- 
3942 or 3949-3954, or functional equivalents thereof, may be used to generate recombinant 
DNA molecules that direct the expression of that nucleic acid, or a functional equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular 
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 or a fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present invention, the
vector may further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those of skill
in the art and are commercially available for generating the recombinant constructs of the present
invention. The following vectors are provided by way of example. Bacterial: pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44,
pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression 
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing 
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in 
Enzymology 185, 537-566 (1990). As defined herein, "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol 
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are 
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art. 
Generally, recombinant expression vectors will include origins of replication and selectable 
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is 
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or 
more phenotypic selectable markers and an origin of replication to ensure maintenance of the 
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for 
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice. 

As a representative but non-limiting example, useful expression vectors for bacterial use 
can comprise a selectable marker and bacterial origin of replication derived from commercially 
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For 
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or 
following injection, and preferably intramuscular injection of the DNA. The nucleic acid 
sequences are preferably inserted in a recombinant expression vector and may be in the form of 
naked DNA. 

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that 
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide 
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. In 
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence 
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO: 
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided. 
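Complementarity to a coding strand, as described above, corresponds to taking the reverse complement of the sense sequence under Watson-Crick pairing. The sketch below is illustrative only: the names `reverse_complement` and `antisense_oligo` are hypothetical, the input is assumed to be unmodified DNA written 5' to 3' with the letters A, C, G, and T, and the window coordinates are zero-based.

```python
# Watson-Crick complements for DNA; case is preserved.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(sense, start, length):
    """Antisense sequence complementary to sense[start:start + length]."""
    if start < 0 or length <= 0 or start + length > len(sense):
        raise ValueError("window lies outside the sense sequence")
    return reverse_complement(sense[start:start + length])
```

For the sense sequence `ATGGCCAAA`, the antisense sequence against the first six bases is `GGCCAT`. An antisense RNA would use U in place of T, and modified nucleotides are not represented in this sketch.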

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID 
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the invention can be 
designed according to the rules of Watson and Crick or Hoogsteen base pairing. The antisense 
nucleic acid molecule can be complementary to the entire coding region of a mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding
region of a mRNA. For example, the antisense oligonucleotide can be complementary to the 
region surrounding the translation start site of a mRNA. An antisense oligonucleotide can be, for 
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic 
acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions 

30 using procedures known in the art. For example, an antisense nucleic acid (e.g. , an antisense 
oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or 
variously modified nucleotides designed to increase the biological stability of the molecules or to 
increase the physical stability of the duplex formed between the antisense and sense nucleic 
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
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Examples of modified nucleotides that can be used to generate the antisense nucleic acid 
include: 5-fluorouracil, 5-bromouracil, 5-chlorouraciI, 5-iodouracil, hypoxanthine, xanthine, 
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl- 
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, 
5 inosine, N6-isopentenyladenine, 1 -methylguanine, 1-methylinosine, 2,2-dimethylguanine, 
2-methyiadenine, 2 -methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 
7-methyl guanine, 5 -methy laminomethy luracil, 5-methoxyaminomethy I-2-thiouracil, 
beta-D-mannosylqueosine, S'-methoxycarboxymethy luracil, 5-methoxyuracil, 

2- methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, 
10 queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouraciI, 4-thiouracil, 5-methyluracil, 

uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 

3- (3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the 
antisense nucleic acid can be produced biologically using an expression vector into which a 
nucleic acid has been subcloned in an antisense orientation {i.e., RNA transcribed from the 

15 inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, 
described further in the following subsection). 
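
As a minimal sketch of the Watson-Crick design rule discussed above, an antisense DNA
oligonucleotide of a chosen length can be derived as the reverse complement of a window of the
coding strand, for example a window around the translation start site. The function names
reverse_complement and antisense_oligo, the parameter names, and the example sequence are
assumptions made for illustration; the sketch covers unmodified DNA bases only and does not
address Hoogsteen pairing or modified nucleotides.

```python
# Hypothetical sketch: derive an antisense oligonucleotide by Watson-Crick
# complementarity to a window of the coding (sense) strand.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, center: int, length: int = 20) -> str:
    """Return an antisense oligo of `length` bases targeting the region
    around 0-based index `center` (e.g., the A of the ATG start codon)."""
    if length <= 0 or length > len(coding_strand):
        raise ValueError("length must be between 1 and the sequence length")
    # Center the window, then clamp it so it stays inside the sequence.
    start = max(0, center - length // 2)
    start = min(start, len(coding_strand) - length)
    target = coding_strand[start:start + length]
    return reverse_complement(target)


if __name__ == "__main__":
    sense = "GGCATGAAATTTTAGCC"
    print(antisense_oligo(sense, center=sense.find("ATG"), length=10))
```

The returned oligonucleotide is written 5' to 3' and pairs with the selected window of the sense
strand; the same sequence with U in place of T would correspond to an antisense RNA.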

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from
a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.


4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the corresponding full
length or mature protein. Polypeptides of the invention also include polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO:
985-1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960
or the corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., at
least about 65%, at least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%,
84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically at least about
90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity) that retain biological activity. Polypeptides encoded by allelic variants may
have a similar, increased, or decreased activity compared to polypeptides comprising SEQ ID
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
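
For illustration, the percent sequence identity thresholds recited above can be evaluated for two
sequences that have already been aligned. The following Python sketch is an assumption-laden
example, not a definition adopted by this specification: the function name percent_identity is
hypothetical, gaps are written as "-", and identity is computed as identical positions divided by
the number of aligned columns. Alignment programs differ in the denominator they use, so values
may not match those reported by a particular tool.

```python
# Hypothetical sketch: percent identity over a pre-computed pairwise alignment.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two gapped, equal-length aligned sequences.

    Assumption: identity = identical non-gap columns / all aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    identity = percent_identity("MKTAYIAK", "MKTAYLAK")
    # Compare against one of the recited thresholds, e.g. 85%.
    print(identity, identity >= 85.0)
```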

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
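
The notion of a "degenerate variant" can be illustrated with a short Python sketch that translates
two ORFs with the standard genetic code and checks that they differ in nucleotide sequence yet
encode the same polypeptide. The function names translate and is_degenerate_variant and the
example ORFs are hypothetical; the sketch assumes the standard genetic code, DNA letters A, C, G
and T only, and an ORF length that is a multiple of three.

```python
# Hypothetical sketch: test whether two ORFs are degenerate variants.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an ORF into one-letter amino acids ('*' marks a stop)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
    print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```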

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that comprise specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
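
The enumeration step of the alanine-scanning method described above can be sketched in Python
as follows. The sketch only generates the panel of variant sequences in which a single residue, or
a string of consecutive residues, is replaced with alanine; testing each variant for biological
activity remains an experimental step. The function name alanine_scan, the window parameter and
the example peptide are hypothetical and chosen for illustration only.

```python
# Hypothetical sketch: enumerate alanine-scanning variants of a protein.
from typing import Iterator, Tuple


def alanine_scan(protein: str, window: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, variant sequence) pairs.

    Each variant replaces `window` consecutive residues with alanine.
    Windows that already consist solely of alanine are skipped because
    the substitution would not change the sequence.
    """
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue
        yield i + 1, protein[:i] + "A" * window + protein[i + window:]


if __name__ == "__main__":
    for position, variant in alanine_scan("MKAC"):
        print(position, variant)
```

A loss of activity in the variant generated at a given position would suggest that the substituted
residue or residues contribute to function.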

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of the polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified in computer
programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCB NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
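
As a non-limiting illustration of how a percent identity figure can be derived from a pairwise alignment, the following Python sketch performs a simple global alignment and counts matching positions. It is not the GAP, BLAST, or FASTA algorithm as implemented in the programs cited above; the scoring values (`match`, `mismatch`, `gap`) and the convention of dividing by alignment length are assumptions chosen for the example, and the function names are hypothetical.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> Tuple[str, str]:
    """Globally align two sequences with a linear gap penalty (Needleman-Wunsch)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    aligned_a, aligned_b = global_align(a.upper(), b.upper())
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

Different programs normalize identity differently (for example, by the length of the shorter sequence or by the aligned region only), so values from this sketch are not expected to reproduce the output of any particular package.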

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention comprise one or more domains fused to
sequences derived from a member of the immunoglobulin protein family. The immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical compositions and
administered to a subject to inhibit an interaction between a ligand and a protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The immunoglobulin
fusion proteins can be used to affect the bioavailability of a cognate ligand. Inhibition of the
ligand/protein interaction may be useful therapeutically both for the treatment of proliferative
and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays to
identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
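
The requirement that the fused coding sequences remain in-frame can be checked computationally before any cloning is attempted. The following Python sketch is offered as an illustration only: the function name `fuse_in_frame`, the optional `linker` argument, and the short example sequences are hypothetical, and the check covers only reading-frame length and premature stop codons in the upstream partner and linker, not restriction sites, codon usage, or protein folding.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> List[str]:
    """Split a coding sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(upstream: str, downstream: str, linker: str = "") -> str:
    """Join two coding sequences so the downstream partner stays in frame.

    A terminal stop codon on the upstream partner is removed so that
    translation reads through into the linker and the downstream partner.
    """
    upstream, downstream, linker = upstream.upper(), downstream.upper(), linker.upper()
    for name, part in (("upstream", upstream), ("linker", linker),
                       ("downstream", downstream)):
        if len(part) % 3 != 0:
            raise ValueError(
                f"{name} length {len(part)} is not a multiple of 3; "
                "the fusion would shift the reading frame")

    upstream_codons = codons(upstream)
    if upstream_codons and upstream_codons[-1] in STOP_CODONS:
        upstream_codons = upstream_codons[:-1]  # allow read-through

    leading = upstream_codons + codons(linker)
    if any(codon in STOP_CODONS for codon in leading):
        raise ValueError("premature stop codon before the downstream partner")
    return "".join(leading) + downstream
```

For example, under these assumptions `fuse_in_frame("ATGGCCTAA", "ATGAAATGA")` drops the upstream `TAA` and returns a single open reading frame, whereas an upstream fragment whose length is not a multiple of three is rejected.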

4.8 GENE THERAPY

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.

Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
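
An antisense molecule is, at its simplest, the reverse complement of a chosen stretch of the target sequence. The Python sketch below illustrates only that step. The function names, the 0-based `start` convention, and the example sequence are hypothetical; practical antisense design also weighs target accessibility, oligonucleotide chemistry, and off-target matches, none of which is modeled here.

```python
# Maps each sense-strand base to its DNA complement; "U" is accepted so that
# an RNA target can be supplied, and the oligonucleotide is returned as DNA.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")
_ALLOWED = set("ACGTU")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    seq = seq.upper()
    unexpected = set(seq) - _ALLOWED
    if unexpected:
        raise ValueError(f"unexpected characters in sequence: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(target: str, start: int, length: int) -> str:
    """Return an antisense DNA oligonucleotide against target[start:start + length].

    `start` is 0-based and counted from the 5' end of the sense strand.
    """
    if start < 0 or length < 1 or start + length > len(target):
        raise ValueError("requested window falls outside the target sequence")
    return reverse_complement(target[start:start + length])


if __name__ == "__main__":
    # Hypothetical target; the window covers the first six bases.
    print(antisense_oligo("ATGGCCAAGT", start=0, length=6))
```

The returned oligonucleotide is written 5' to 3' and would base-pair with the selected window of the sense strand.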

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptides as well as for studying modulators of
the polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 


4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6, Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11, Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1, John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of


WO 01/57190 PCT/US01/04098

cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus 
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) 
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on a semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis 
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, 
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy 
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.


4.10.6 TISSUE GROWTH ACTIVITY

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in 
circumstances where bone is not normally formed, has application in the healing of bone 
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of 
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair 
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is 
useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of 
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or 
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention. 

Another category of tissue regeneration activity that may involve the polypeptide of the 
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed, has application 
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in 
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing 
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as 
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing 
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by 
a composition of the present invention contributes to the repair of congenital, trauma induced, or 
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for 
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming 
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect 
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis, 
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include 
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).


4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is 
generally an active, non-antigen-specific, process which requires continuous exposure of the T 
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence 
of the tolerizing agent. 

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue 
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, 
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance 
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens. 

The efficacy of particular therapeutic compositions in preventing organ transplant 
rejection or GVHD can be assessed using animal models that are predictive of efficacy in 
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven 
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected patient by 
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed 
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present 
invention as described herein such that the cells express all or a portion of the protein on their 
surface, and reintroduce the transfected cells into the patient. The infected cells would now be 
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease 
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can 
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a 
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A 
polypeptide of the invention may also be useful for advancement of the onset of fertility in 
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic 
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.


4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis. 
Therapeutic compositions of the invention can be used in the following: 

25 Assays for chemotactic activity (which will identify proteins that induce or prevent 

chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells 
across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include, 
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. 

30 M. Kruisbeek, D. H. Marguiles, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates 
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 
•6.12.1-6.12.28; Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 
1995; Muller et al Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 
1994; Johnston et aL J. of Immunol. 153:1762-1768, 1994. 
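
By way of illustration only, results from a membrane migration assay of the kind referred to above are often summarized as a ratio of migrated cells in test wells to migrated cells in medium-only control wells. The following is a minimal sketch under that assumption; the function name, the use of replicate cell counts, and the ratio itself are illustrative choices and are not taken from the cited protocols.

```python
from statistics import mean


def chemotactic_index(test_counts, control_counts):
    """Ratio of mean migrated-cell counts: test wells over medium-only control wells.

    Both arguments are hypothetical replicate counts of cells that crossed
    the membrane; a value above 1 suggests the test protein induced migration.
    """
    if not test_counts or not control_counts:
        raise ValueError("each condition needs at least one replicate")
    control_mean = mean(control_counts)
    if control_mean <= 0:
        raise ValueError("control migration must be positive to form a ratio")
    return mean(test_counts) / control_mean
```

A ratio near 1 would suggest no effect on migration under these assumptions, while a ratio below 1 would be consistent with a protein that prevents chemotaxis.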

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213),
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. Proteins of the present invention (including, without limitation,
fragments of receptors and ligands) may themselves be useful as inhibitors of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification," Murray P. Deutscher (ed.), Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc., San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.


4.10.13 DRUG SCREENING

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of complexes
between polypeptides of the invention or fragments and the agent being tested or examine the
diminution in complex formation between the novel polypeptides and an appropriate cell line,
which are well known in the art.
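
As a non-limiting illustration of how the diminution in complex formation might be quantified in a competitive binding assay, the following sketch expresses the binding signal measured with a test compound as a percent inhibition of the signal measured without it. The function name, the optional background (non-specific) signal, and the formula are assumptions made for this example and are not a method recited herein.

```python
def percent_inhibition(signal_with_compound, signal_no_compound, background=0.0):
    """Percent reduction in specific binding caused by a test compound.

    All three inputs are hypothetical readouts in the same arbitrary units
    (e.g. counts from a labeled ligand). Background is subtracted from both
    measurements before the ratio is formed.
    """
    specific_control = signal_no_compound - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = signal_with_compound - background
    return 100.0 * (1.0 - specific_test / specific_control)
```

Under these assumptions a value near 0 would indicate no competition, and a value near 100 would indicate that the compound has essentially abolished complex formation.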

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including, (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
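
By way of illustration, the comparison of two otherwise identical cell populations described above can be sketched as a fold-change screen over measured responses. The function name, the dictionary inputs keyed by ligand name, and the two-fold threshold are hypothetical choices for this example only; any suitable response readout and threshold could be substituted.

```python
def receptor_dependent_ligands(responses_with, responses_without, min_fold=2.0):
    """Return ligands whose response depends on expression of the receptor.

    responses_with / responses_without map a ligand name to the response
    measured in receptor-expressing and receptor-negative cells respectively
    (hypothetical units). A ligand is reported when the ratio of the two
    responses is at least min_fold.
    """
    hits = []
    for ligand, with_value in responses_with.items():
        without_value = responses_without.get(ligand)
        # Skip ligands lacking a usable measurement in the control population.
        if without_value is None or without_value <= 0:
            continue
        if with_value / without_value >= min_fold:
            hits.append(ligand)
    return sorted(hits)
```

Ligands that elicit a comparable response in both populations are not reported, since such a response would not be attributable to the receptor of the invention.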

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from
over production of cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or may be utilized as an
antiproliferative agent, such as for acute or chronic myelogenous leukemia, or in the prevention
of premature labor secondary to intrauterine infections.


4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured
as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, or syphilis;


(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.
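
By way of illustration only, the principle of restriction fragment length polymorphism analysis mentioned above can be sketched in silico: a single base change that destroys (or creates) a recognition site alters the set of fragment lengths obtained on digestion. The sketch below assumes a linear DNA fragment, searches one strand only, and uses the EcoRI recognition sequence GAATTC (cut after the first G) as the default enzyme; the function names and example sequences are hypothetical.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position within the recognition site at which the
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def rflp_differs(reference, sample, site="GAATTC", cut_offset=1):
    """True when the two sequences give different fragment-length patterns."""
    return sorted(digest(reference, site, cut_offset)) != sorted(
        digest(sample, site, cut_offset)
    )
```

For instance, a hypothetical 14-base reference containing one GAATTC site would yield two fragments, whereas a variant in which the site reads GAGTTC would remain uncut; a polymorphism that leaves every recognition site intact would not be detected by this approach.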

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium in CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 μg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 μg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
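
The per-kilogram ranges stated above can be converted into a total amount per dose for a given body weight. The following is a minimal arithmetic sketch only, assuming the lower bounds are expressed in micrograms per kilogram and the upper bounds in milligrams per kilogram; the function name and the 70 kg body weight are hypothetical and are not part of the disclosure. The actual dosage remains a matter for the prescribing physician.

```python
def dose_range_mg(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Return (low, high) total amount per dose, in mg, for one body weight.

    low_ug_per_kg is assumed to be in micrograms per kg and
    high_mg_per_kg in milligrams per kg (illustrative assumption).
    """
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg * weight_kg / 1000.0  # convert micrograms to mg
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


if __name__ == "__main__":
    weight = 70.0  # hypothetical body weight in kg
    broad = dose_range_mg(weight, 0.01, 100.0)     # broad range from the text
    preferred = dose_range_mg(weight, 0.1, 10.0)   # preferred range from the text
    print("broad range (mg):", broad)
    print("preferred range (mg):", preferred)
```

The sketch only scales the stated bounds linearly with body weight; it does not account for age, condition or response of the patient.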

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell 
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor 
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or 
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternatively, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in 
a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active 
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the 
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee 
cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
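
The 1:1 dilution of VPD with 5% dextrose in water described above implies final component concentrations in VPD:5W that can be worked out by simple proportion. The following is a minimal sketch of that arithmetic, assuming that the volumes of the two liquids are additive on mixing (an approximation, since ethanol and water mixtures contract slightly); the function and variable names are hypothetical.

```python
# Stock compositions in % w/v, taken from the description above.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W = {"dextrose": 5.0}


def mix(stock, diluent, stock_parts=1.0, diluent_parts=1.0):
    """Return % w/v of each component after mixing by volume.

    Assumes additive volumes, which is only approximately true.
    """
    total = stock_parts + diluent_parts
    mixed = {}
    for name, percent in stock.items():
        mixed[name] = mixed.get(name, 0.0) + percent * stock_parts / total
    for name, percent in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + percent * diluent_parts / total
    return mixed


if __name__ == "__main__":
    for name, percent in mix(VPD, D5W).items():
        print(f"{name}: {percent:.2f}% w/v")
```

Under the stated assumption, each VPD component and the dextrose are halved by the 1:1 dilution. Changing `stock_parts` and `diluent_parts` illustrates how the proportions of the co-solvent system may be varied.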

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 μg to about 100 mg (preferably about 0.1 μg to about 10 mg, more preferably 
about 0.1 μg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention, which may also optionally be included in the composition as 
described above, may alternatively or additionally be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin-like growth factor I), to the final composition 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 


4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that can be used to more accurately determine useful doses in 
humans. For example, a dose can be formulated in animal models to achieve a circulating 
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of 
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
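
By way of illustration only, the therapeutic index described above (the ratio between LD50 and ED50) can be computed as in the following sketch. The function names, the log-linear interpolation used to estimate a 50% dose, and the sample numbers are assumptions introduced for this example and are not taken from the disclosure.

```python
import math


def dose_at_half_response(doses, fractions):
    """Estimate the dose giving a 50% response by log-linear interpolation.

    Hypothetical helper: `doses` are positive dose levels and `fractions`
    the fraction of the population responding at each dose.
    """
    if any(d <= 0 for d in doses):
        raise ValueError("doses must be positive")
    pairs = sorted(zip(doses, fractions))
    for (d0, f0), (d1, f1) in zip(pairs, pairs[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_dose = math.log10(d0) + t * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("50% response is not bracketed by the data")


def therapeutic_index(ld50, ed50):
    """Therapeutic index expressed as the ratio LD50 / ED50."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Made-up dose-response data, in arbitrary dose units.
    ed50 = dose_at_half_response([1, 10, 100], [0.0, 0.5, 1.0])
    ld50 = dose_at_half_response([100, 1000], [0.25, 0.75])
    print(f"ED50={ed50:.3g}, LD50={ld50:.3g}, TI={therapeutic_index(ld50, ed50):.3g}")
```

A higher value returned by `therapeutic_index` corresponds to a wider margin between the therapeutically effective dose and the lethal dose.
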
Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
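
As a non-limiting sketch of how the fraction of time above the MEC might be evaluated, the following example takes sampled plasma concentrations (e.g., from HPLC assays) and linearly interpolates between samples. The function name, the interpolation scheme and the sample values are assumptions made for illustration.

```python
def fraction_time_above(times, concs, mec):
    """Fraction of the sampled interval with concentration at or above `mec`.

    Hypothetical helper: concentrations are assumed to vary linearly
    between consecutive sampling times.
    """
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    above = 0.0
    samples = list(zip(times, concs))
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            above += dt                      # whole segment is above the MEC
        elif c0 < mec and c1 < mec:
            continue                         # whole segment is below the MEC
        else:
            # The segment crosses the MEC exactly once; find the crossing time.
            t_cross = t0 + dt * (mec - c0) / (c1 - c0)
            above += (t_cross - t0) if c0 >= mec else (t1 - t_cross)
    return above / (times[-1] - times[0])


if __name__ == "__main__":
    # Made-up samples: hours after dosing and concentration in arbitrary units.
    frac = fraction_time_above([0, 2, 4, 8], [0, 10, 10, 2], mec=5)
    print(f"time above MEC: {frac:.1%}")
```

The returned fraction can then be compared against the 10-90%, 30-90% or 50-90% ranges stated above.
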
An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 ng/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may
be accompanied by instructions for administration. Compositions comprising a compound of the
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and
F(ab')2 fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated protein of the invention, or a portion or fragment thereof, may serve as an
antigen and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein, such as an amino acid sequence shown in SEQ ID NO:985, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein,
e.g., a hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to contain surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte-
Doolittle or the Hopp-Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle, 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
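
The hydropathy analysis referred to above can be sketched, purely by way of example, as a sliding-window average over the Kyte-Doolittle hydropathy scale. The window length of 7 residues, the function names and the toy sequence are assumptions chosen for this illustration; no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982);
# more negative values indicate more hydrophilic residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    try:
        scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from None
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    seq = sequence.upper()
    profile = hydropathy_profile(seq, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, seq[start:start + window], profile[start]


if __name__ == "__main__":
    # Toy sequence, not a sequence of the invention.
    print(most_hydrophilic_window("IIIIIIIKKKKKKKIIIIIII"))
```

Windows with strongly negative mean scores are candidate surface-exposed regions that might be evaluated as antigenic peptides.
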

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.


5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
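
For illustration, a classical linear Scatchard transform for a single class of binding sites can be sketched as follows. This is a simplified example under the assumption of one-site binding, B/F = (Bmax - B)/Kd, fitted by ordinary least squares; it is not asserted to be the procedure of the cited reference, and the function name and synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from a linear fit of bound/free against bound.

    Assumes single-site binding, so that B/F = Bmax/Kd - B/Kd:
    the slope is -1/Kd and the x-intercept is Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matching bound/free measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # dissociation constant
    bmax = -intercept / slope    # total concentration of binding sites
    return kd, bmax


if __name__ == "__main__":
    # Synthetic, noise-free data generated with Kd = 2.0 and Bmax = 10.0
    # (arbitrary concentration units).
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [10.0 * f / (2.0 + f) for f in free]
    kd, bmax = scatchard_fit(bound, free)
    print(f"Kd={kd:.3f}, Bmax={bmax:.3f}, Ka={1.0 / kd:.3f}")
```

A smaller fitted Kd (equivalently, a larger association constant Ka = 1/Kd) corresponds to a higher binding affinity of the antibody for the antigen.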

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a
human immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).


5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991, EMBO J., 10:3655-3659.
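
The figure of ten different antibody molecules can be checked with a short enumeration, offered here only as an illustrative sketch. It assumes two heavy chains (labeled H1 and H2) and two light chains (L1 and L2), with H1/L1 and H2/L2 as the cognate pairs; the labels and function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def quadroma_species(heavy, light):
    """Enumerate distinct antibodies formed by random chain assortment.

    Each antibody is modeled as an unordered pair of heavy/light
    half-molecules, so the order within a pair does not matter.
    """
    half_molecules = [h + l for h, l in product(heavy, light)]
    return list(combinations_with_replacement(half_molecules, 2))


def is_bispecific(species):
    """True only for the antibody carrying both cognate heavy/light pairs."""
    return set(species) == {"H1L1", "H2L2"}


if __name__ == "__main__":
    species = quadroma_species(("H1", "H2"), ("L1", "L2"))
    for s in species:
        print(s, "<- correct bispecific" if is_bispecific(s) else "")
    print(len(species), "species in total")
```

Four possible half-molecules give ten unordered pairs, of which a single pair combines both cognate half-molecules, consistent with the need for the purification steps noted above.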

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Holliger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4-
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195
(1992) and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in
Wolff et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered
that has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
²¹²Bi, ¹³¹I, ¹³¹In, ⁹⁰Y, and ¹⁸⁶Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of a radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.


4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable mediums can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
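
By way of illustration only, the two structuring formats named above (an ASCII text file and a database record) can be sketched as follows. This is a minimal sketch, not a required implementation: the FASTA layout, the SQLite database (standing in for DB2, Sybase or Oracle), the table and column names, and the placeholder sequence are all assumptions introduced for the example and are not sequences or formats of the invention.

```python
import sqlite3

# Placeholder identifier and residues, invented for this sketch only.
EXAMPLE_ID = "example_seq_1"
EXAMPLE_SEQ = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"


def to_fasta(seq_id: str, sequence: str, width: int = 60) -> str:
    """Format a nucleotide sequence as a plain-ASCII FASTA record."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{seq_id}\n" + "\n".join(lines) + "\n"


def store_in_database(conn: sqlite3.Connection, seq_id: str, sequence: str) -> None:
    """Record a sequence in a database table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences (seq_id, residues) VALUES (?, ?)",
        (seq_id, sequence),
    )
    conn.commit()


def fetch_from_database(conn: sqlite3.Connection, seq_id: str):
    """Return the stored residues for seq_id, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

The text form suits sequential reading by search programs, while the database form suits keyed retrieval, which reflects the statement above that the choice of structure follows from the means chosen to access the stored information.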

By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or a representative fragment thereof; or a nucleotide sequence at least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
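
As a purely illustrative aid, the notion of locating ORFs within a recorded sequence can be sketched with a simple six-frame scan. This sketch is not the BLAST or BLAZE procedure referred to above; it substitutes a plain ATG-to-stop scan, and the function names, the `min_codons` threshold and the standard stop codon set are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, start, end, orf) tuples. Coordinates are 0-based,
    half-open, and refer to the strand that was scanned; the stop codon
    is included in the reported ORF.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return results
```

Candidate ORFs found this way would still need to be compared against known sequences, for example with the search software discussed in this section, before being treated as protein encoding.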

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A

skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
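
The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below assumes, for illustration only, a database of independent and equally probable residues, so that a target of length n is expected to occur by chance about (N - n + 1) / 4^n times in N nucleotides; real databases depart from this model. The exact-match function stands in for a search means and is far simpler than the packages listed above. All names are hypothetical.

```python
def expected_random_hits(target_len: int, database_len: int,
                         alphabet_size: int = 4) -> float:
    """Expected number of chance exact matches under a uniform random model."""
    positions = max(database_len - target_len + 1, 0)
    return positions / (alphabet_size ** target_len)


def find_exact_matches(target: str, database_seq: str):
    """Return 0-based start positions of every exact match, overlaps included."""
    hits = []
    pos = database_seq.find(target)
    while pos != -1:
        hits.append(pos)
        pos = database_seq.find(target, pos + 1)
    return hits
```

Under this model each added nucleotide divides the expected number of chance matches by roughly four (about twenty per residue for an amino acid target, using `alphabet_size=20`), which is why longer targets are less likely to be present as random occurrences.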

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
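
As one hedged example of searching for a nucleic acid target motif, a hairpin can be approximated as a short stem whose reverse complement recurs downstream after a loop. The sketch below is an assumption-laden simplification: it requires perfect base pairing, ignores folding energetics, and its default stem and loop sizes are arbitrary values chosen for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_hairpins(seq: str, stem_len: int = 4,
                  min_loop: int = 3, max_loop: int = 8):
    """Find candidate hairpins with a perfectly paired stem.

    Returns (stem_start, loop_length, stem_sequence) tuples, where the
    downstream arm is the reverse complement of the upstream stem.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        stem = seq[i:i + stem_len]
        partner = reverse_complement(stem)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop  # start of the downstream arm
            if j + stem_len > len(seq):
                break
            if seq[j:j + stem_len] == partner:
                hits.append((i, loop, stem))
    return hits
```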

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
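
The sequence-dependence of antisense design can be illustrated with a minimal sketch that derives a DNA oligonucleotide complementary to a chosen mRNA window. Only the 20 to 40 base length range comes from the text above; the function name, the window-selection interface and the placeholder mRNA in the usage note are assumptions, and practical design would also weigh target accessibility and specificity, which this sketch ignores.

```python
# Maps each mRNA base to its complementary DNA base.
_RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (written 5'->3') complementary to an mRNA window.

    `start` is a 0-based offset into the mRNA; `length` must lie in the
    20 to 40 base range stated in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("window extends past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(_RNA_TO_DNA_COMPLEMENT)[::-1]
```

For instance, calling `antisense_oligo("AUGGCCAUUGUAAUGGGCCGCUGA", 0, 20)` on this invented 24-base mRNA yields an oligonucleotide that base-pairs with its first 20 bases.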

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.


4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.


4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents consists of agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of a cell type of such a tissue in a
sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
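
To show what a degenerate pool of possible sequences amounts to in practice, the sketch below expands a probe written with standard IUPAC nucleotide ambiguity codes into every discrete sequence it represents. The use of IUPAC codes and the function name are assumptions introduced for illustration; the text above does not prescribe a notation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str):
    """List every discrete sequence encoded by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of alternatives at each position, so a probe with one two-fold and one four-fold degenerate position represents eight discrete sequences.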

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced
using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell.
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. Nunc
Laboratories has developed a method by which DNA can be covalently bound to a microwell
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 µl/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
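
The per-well quantities implied by the volumes above can be checked with a short calculation. This is a sketch under stated assumptions: volumes are taken to be additive, and the DNA is treated as still being at 7.5 ng/µl when dispensed, which slightly overstates the mass per well because the added 1-methylimidazole dilutes the solution. The variable and function names are hypothetical.

```python
def final_concentration(stock_conc: float, added_volume: float,
                        starting_volume: float) -> float:
    """Concentration after adding a stock to an existing volume (additive volumes)."""
    return stock_conc * added_volume / (starting_volume + added_volume)


DNA_NG_PER_UL = 7.5    # DNA concentration from the text
DNA_VOLUME_UL = 75.0   # ss DNA solution dispensed per well
EDC_STOCK_M = 0.2      # EDC stock concentration
EDC_VOLUME_UL = 25.0   # EDC volume added per well

# Upper bound on DNA mass per well (ignores dilution by 1-methylimidazole).
dna_per_well_ng = DNA_NG_PER_UL * DNA_VOLUME_UL

# EDC concentration in the 100 ul reaction volume.
edc_final_molar = final_concentration(EDC_STOCK_M, EDC_VOLUME_UL, DNA_VOLUME_UL)
```

On these assumptions each well receives at most about 560 ng of DNA, and the EDC is diluted fourfold to roughly 50 mM in the final 100 µl.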

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, digestion with restriction enzymes as described at 9.24-9.28 of
Sambrook et al (1989), shearing by ultrasound, and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
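
The two specificities described above can be illustrated with a short sketch. It is not taken from Fitzgerald et al; the function names are hypothetical, and the only inputs assumed are the recognition sequences stated in the text (PuGCPy under normal conditions; PuGCPy, PyGCPy and PuGCPu under relaxed conditions) and the cut position between G and C.

```python
import re

# Shorthand from the text: Pu = purine (A or G), Py = pyrimidine (C or T).
PU = "[AG]"
PY = "[CT]"

# Normal specificity: PuGCPy.
# Relaxed (CviJI**) specificity as stated in the text: PuGCPy, PyGCPy and PuGCPu.
# Lookahead patterns are used so that overlapping sites are all reported.
NORMAL = re.compile(f"(?=({PU}GC{PY}))")
RELAXED = re.compile(f"(?=({PU}GC{PY}|{PY}GC{PY}|{PU}GC{PU}))")


def cut_positions(seq, pattern):
    """Return 0-based indices of the C that follows each cut (cut is between G and C)."""
    seq = seq.upper()
    return [m.start() + 2 for m in pattern.finditer(seq)]


def fragments(seq, pattern):
    """Split seq into blunt-ended fragments at every site matched by pattern."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    demo = "AAGCTTTGCCTTGCA"  # toy sequence chosen for illustration only
    print(fragments(demo, NORMAL))
    print(fragments(demo, RELAXED))
```

On the toy sequence, the relaxed pattern cuts at one additional site (a PyGCPy site) that the normal pattern leaves intact, while a PyGCPu site is cut by neither.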

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
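
The arithmetic behind the 64-patient example can be checked with a brief sketch. The square 8 x 8 arrangement of dots within each subarray, the 1 mm dot pitch, and the function name are assumptions made for illustration; the text specifies only the dot span, the 1 mm space between subarrays, the 96-pin device and the 8 x 12 cm membrane.

```python
import math


def subarray_layout(n_samples, dot_pitch_mm, gap_mm, pins=(8, 12)):
    """Estimate the footprint of offset-printed subarrays (illustrative only).

    Assumes each pin deposits a square block of dots, one dot per sample,
    with a fixed gap between neighbouring blocks.
    """
    side = math.isqrt(n_samples)
    if side * side != n_samples:
        side += 1  # round up to the next square that holds all samples
    sub_mm = side * dot_pitch_mm      # edge length of one subarray
    pitch = sub_mm + gap_mm           # centre-to-centre spacing of the pins
    rows, cols = pins
    return {
        "dots_per_side": side,
        "subarray_mm": sub_mm,
        "pin_pitch_mm": pitch,
        "footprint_mm": (rows * pitch - gap_mm, cols * pitch - gap_mm),
        "total_dots": rows * cols * n_samples,
    }


if __name__ == "__main__":
    layout = subarray_layout(n_samples=64, dot_pitch_mm=1.0, gap_mm=1.0)
    print(layout)
    height, width = layout["footprint_mm"]
    print("fits 80 x 120 mm membrane:", height <= 80 and width <= 120)
```

Under these assumptions the pin spacing comes out at 9 mm, and the 96 subarrays (6144 dots in total) occupy about 71 x 107 mm, which is within the 8 x 12 cm membrane mentioned in the text.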

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.
All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2 
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 1969-2951,
and 3949-3954 were assembled using an EST sequence as a seed. Then a recursive algorithm was
used to extend the seed EST into an extended assemblage, by pulling additional sequences from
different databases (i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri
114, and UniGene version 101) that belong to this assemblage. The algorithm terminated when
there were no additional sequences from the above databases that would extend the assemblage.
Inclusion of component sequences into the assemblage was based on a BLASTN hit to the
extending assemblage with BLAST score greater than 300 and percent identity greater than 95%.
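
The seed-extension procedure can be sketched as follows. This is a simplified illustration and not the program actually used: `find_hits` is a hypothetical callable standing in for a BLASTN search of the databases, each member sequence is queried individually rather than the assemblage as a whole, and an explicit worklist replaces recursion. Only the two inclusion thresholds (score greater than 300, identity greater than 95%) and the stopping rule come from the text.

```python
def extend_assemblage(seed_id, find_hits, min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed until no further sequence qualifies.

    find_hits(query_id) is a hypothetical stand-in for a BLASTN search; it
    must return an iterable of (hit_id, blast_score, percent_identity).
    """
    members = {seed_id}
    frontier = [seed_id]  # sequences whose hits have not yet been examined
    while frontier:
        current = frontier.pop()
        for hit_id, score, identity in find_hits(current):
            if hit_id in members:
                continue
            # Inclusion criteria stated in the text: both must be exceeded.
            if score > min_score and identity > min_identity:
                members.add(hit_id)
                frontier.append(hit_id)
    # The loop ends when no remaining sequence extends the assemblage.
    return members


if __name__ == "__main__":
    # Toy hit table, invented purely to exercise the function.
    toy_hits = {
        "seed": [("est_a", 350, 98.0), ("est_b", 290, 99.0)],
        "est_a": [("est_c", 400, 96.5), ("seed", 350, 98.0)],
        "est_c": [("est_d", 500, 95.0)],
    }
    print(sorted(extend_assemblage("seed", lambda q: toy_hits.get(q, []))))
```

In the toy data, one candidate is rejected on score and another on identity (exactly 95% does not exceed the threshold), so the assemblage stops growing after two additions.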

Tables 6 and 8 set forth the novel predicted polypeptides (including proteins) encoded by
the novel polynucleotides (SEQ ID NO:2953-3936, and 3949-3954) of the present invention, and
their corresponding nucleotide locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted. Method A refers to a
polypeptide obtained by using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). Method B refers to a polypeptide obtained by using a
software program called GenScan for human/vertebrate sequences (available from Stanford
University, Office of Technology Licensing) that predicts the polypeptide based on a probabilistic
model of gene structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol., 268:78-94
(1997), incorporated herein by reference). Method C refers to a polypeptide obtained by using a
Hyseq proprietary software program that translates the novel polynucleotide and its complementary
strand into six possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
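
The general idea behind Method C, six-frame translation followed by selection of the longest open reading frame, can be sketched as below. This is not the Hyseq proprietary program, whose details are not disclosed here. The sketch assumes the standard genetic code and defines an open reading frame as running from a methionine to the next stop codon or to the end of the sequence; all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Yield (strand, offset, protein) for the three forward and three reverse frames."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            yield strand, offset, translate(s[offset:])


def longest_orf(seq):
    """Return (peptide, strand, offset) for the longest Met-initiated ORF."""
    best = ("", None, None)
    for strand, offset, protein in six_frames(seq):
        for segment in protein.split("*"):  # '*' marks a stop codon
            start = segment.find("M")
            if start == -1:
                continue
            orf = segment[start:]
            if len(orf) > len(best[0]):
                best = (orf, strand, offset)
    return best


if __name__ == "__main__":
    print(longest_orf("CCATGGCTAAATGACC"))  # toy sequence for illustration
```

Other definitions of an open reading frame (for example, the longest stop-free stretch regardless of an initiating methionine) would change which polypeptide is chosen; the choice made here is only one reasonable convention.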

5.3 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA sequences
and their corresponding protein sequences were generated from the assemblage. Any frame shifts
and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences are shown in the
Sequence Listing as SEQ ID NO: 1-351. The amino acids are SEQ ID NO: 985-1335.
Table 1 shows the various tissue sources of SEQ ID NO: 1-351.

The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.


WO 01/57190 PCT/US01/04098 
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites,"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.4 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 352-766. The
corresponding amino acids are SEQ ID NO: 1336-1750.

Table 1 shows the various tissue sources of SEQ ID NO: 352-766. 

The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 352-766 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5 EXAMPLE 5 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 767-930. The
corresponding amino acid sequences are SEQ ID NO: 1751-1914.

Table 1 shows the various tissue sources of SEQ ID NO: 767-930. 

The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000 release
21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences which the nucleic
acid sequences encode are shown in the Sequence Listing. The homologues with identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.6 EXAMPLE 6 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 931-965. The
corresponding amino acid sequences are shown in SEQ ID NO: 1915-1949.

Table 1 shows the various tissue sources of SEQ ID NO: 931-965.

The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.7 EXAMPLE 7 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 966-974. The
corresponding amino acid sequences are SEQ ID NO: 1950-1958.

Table 1 shows the various tissue sources of SEQ ID NO: 966-974.

The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.8 EXAMPLE 8 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 975-984. The
corresponding amino acid sequences are SEQ ID NO: 1959-1968.

Table 1 shows the various tissue sources of SEQ ID NO: 975-984. 

The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 21, 2000
release (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 975-984 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 975-984 are shown in Table 2 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.9 EXAMPLE 9 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The
corresponding peptide sequences are SEQ ID NO: 3943-3948.

Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942. 

The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a BLASTP
version 2.0al 19MP-WashU search against Genpept release 120 and Geneseq October 12, 2000
release 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the closest
homologue for SEQ ID NO: 3937-3942 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologs
with identifiable functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.

Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 10 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 11 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from the
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
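
By way of illustration only, the two summary values reported in Table 12 can be sketched as follows. The sketch assumes that per-residue S scores have already been produced by the signal peptide prediction program; it does not reproduce the neural network itself. The function name `summarize_s_scores`, its arguments, and the convention that the mean is taken over the residues preceding the predicted cleavage site are assumptions made for this example, not details stated in this specification.

```python
def summarize_s_scores(s_scores, cleavage_site):
    """Return (max_s, mean_s) for one polypeptide.

    s_scores      -- per-residue S scores, index 0 being residue 1
                     (hypothetical input, e.g. parsed from program output)
    cleavage_site -- 1-based position of the first residue after the
                     predicted signal peptide (assumed convention)
    """
    if not 1 < cleavage_site <= len(s_scores) + 1:
        raise ValueError("cleavage_site must leave at least one signal residue")
    # Residues 1 .. cleavage_site-1 form the predicted signal peptide.
    signal_region = s_scores[: cleavage_site - 1]
    max_s = max(s_scores)  # highest S score over all supplied positions
    mean_s = sum(signal_region) / len(signal_region)
    return max_s, mean_s


if __name__ == "__main__":
    example_scores = [0.9, 0.8, 0.7, 0.2, 0.1]  # made-up values
    print(summarize_s_scores(example_scores, cleavage_site=4))
```

The example scores are invented purely to exercise the function and do not correspond to any sequence in the Sequence Listing.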


Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID NOS.


TABLE 1

Tissue Origin

RNA Source

Library Name

SEQ ID NOS:

lung 



3 11 25 49 65 75 114 141 156 160 172 
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 

adult brain 

GIBCO 

AB3001 

1 3 12-13 16 22-24 28-29 41 48 58 65 78 
82 89-90 94 97 103 112 114-115 117 120 
122 130-131 168 181 184 186-187 189- 
190 198 208 216 247 249 259 270 277 
297 301 308 312 314 321 333 348 374
396 403 406 410 412 416-417 420 423 
426-427 431 456 474 481 484-485 488 
498 500 508-509 530 549 553 558 563- 
564 583 596 602-603 608 612 621-622 
624 643 650 674 699 711 736 738-739
753 770 779-780 785-786 802-803 816 
822 839 842 848 859 861 871 893-894 
897 900 903 925 954 958 967 969 

adult brain 

GIBCO 

ABD003 

3 19 21-25 28-29 31 33-34 37 39 41 46-48 
53 58 63-64 66 72 78 80 99 103 109-110
112 114 118 120-124 126 132-133 135
139 143 146 148-149 159 163 168 174
176 179-180 184-185 188-190 202 208-
209 216-217 221 223 230 234-235 240 
244 249 251 253 255 258-259 263 269- 
270 277 282 285-286 290 294-295 297 
301-302 304-305 307-308 311-312 314 
320 329 333 335-336 342 344 346 349 
354 358 365 370 373-374 377 380 382- 
383 388 394-396 399 401-402 406 409- 
410 413 416 420-421 425 428 430-431 
436-437 442 456 462 464 466-467 474 
484 486 495-496 500-501 506 508-509 
519 530 537 542 549 561-562 564 572 
574 577-578 580-583 586-587 589 592- 
593 596-597 601 608 610 612-614 617- 
624 630-632 635 637 650 658 663-664 
668 676 679 681 689-690 693 699 724 
726 732 736 742-743 747 767-770 780 
784 789 793 799 802-805 813 817-818 
822 824 829-831 837 839 845 848 856 
859-860 864 871-872 875-876 881 887 
896-897 901 903 907 910-911 925 930 
933 943-944 947 952-953 958 962-963 
965 967 972 977 

adult brain 

Clontech 

ABR001 

3 53 66 113 115 126 135 160 172 179 185 
204 263 273 305 312 323 358 380 383 
395-396 403 420 428-429 431 461 542 
583 586 606-607 611 620 645-646 688
690 715 732 736 740 748 754 768 784- 
786 790 796 800 878 897 906-907 947 
977 

adult brain 

Clontech 

ABR006 

19 32 49 53 60 72 91 103 118 125 130- 
131 134 184 224 275 338 350 354 361- 
363 374 384 390 394 396 431-432 434- 
435 445 468 549 621 732 734-736 745 
760-761 764 768-769 775 787 806 811
818 887 903 906 918 930 942 947 957 
973 977 

adult brain 

Clontech 

ABR008 

2-3 9-11 14 17 21 23-25 28-29 31-35 37 
41-42 45 47-48 56-57 65-66 69-70 72 75 
77-78 88 91-92 97-99 101 103 112-115
118-128 130-131 135 138-140 142 144- 
146 148 152 156-157 159-160 163 168 
172 174 176 178-180 182-190 194 196- 
198 200-201 204 209-214 218 220-225 
228-230 232-233 238-240 243-244 246 
254-256 260-264 270 272-274 278-279 
282-285 289-291 293-294 296-297 301 
303-306 312-314 317 321-322 325-328 
334 336 338 340-342 344 346 348 350- 
352 354 356-358 363 366 369-374 376 
379-381 383-386 388-394 398-399 402-
403 405 409-412 414 418-421 423-424
426-427 430 433-437 443 445-450 452 
456-457 460 462 464 471 479 482-483 
485 488 490-498 505 507 510 516 519- 
522 524 527-532 535 538-539 542-545 
548 551 553 555 561-562 566 569 571 
574 580-583 588-589 593 597 601-608 
611-612 614-615 617-618 621-622 624 
630-635 642 644 646-648 650-652 655 
657 659-661 664-665 668 672 674 689 
693-699 701-702 708 711 715 717 724
728-730 732 734-735 738-740 745 747- 
750 753-755 757 761 763-764 766-769 
772-773 775 780-781 789-791 793-795 
799-800 802-806 809 812 818-819 821- 
822 826 829-830 832 834-835 841 843 
845 856 858-859 861 864 866 870 872
876 880 883 885 887 893-898 902 906-
916 918 921 925-926 930-931 933 942- 
943 946 948 950-951 953-954 958-960 

OAO O/CC (\cn o/rn qta 0*71 077 
yoZ-y 0 j y 0 / yoy-y fv y 1 JLy 1 1 

adult brain 

Clontech 

ABR011

j / [yb Z/U 3U4 344 43o oJ4 

adult brain 

BioChain

a DDA1 1 

1/1 00 101 100 1 /CO /C01 

14 lzl-122 loo oyl 

adult brain 

Invitrogen 

ABR013 

72 108 263 270 336 425 492-494 732 787 
ly\) &ZO ooU 

adult brain 

Invitrogen 

ABR014 

293 394 399 764 768-769 928 967 

adult brain 

Invitrogen 

ABR015 

738-739 764 

adult brain 

Invitrogen 

ABR016 

320 374 396 399 405 684 742-743 767 
931 947 967 

adult brain 

Invitrogen 

ABT004 

21 33-34 37-38 47 52 57-58 69 72 91-93 
109 119 122-124 126-127 135 142-143 
158 167-168 185-188 194 200 212 232 
242 246 255 258 270 277 279 293 301 
312-313 319 322-323 331 341 346 348 
371 374 388 391 394 399 401 409 411
429 436-437 456 462 477 488 496 498 
510 512 515 539 542 545 549 559 563 
573 579 587 589 601-605 612 620-621
624 640 643 647 681 715 723 728 732
735-736 740 745 748 753 766 785-786 
792-793 797-801 812 822 829-831 853- 
856 859 876-877 884 893-894 908-909 
918 925 933 950 969 978 

cultured 
preadipocytes - 

Stratagene

ADP001 

4 28-29 69 93 114 121 132-133 135 151- 
152 159 167 172 178 181 184 190 194- 
195 203-204 209 217 219 240 248 260- 
262 267 273-274 277 282 297 301 304 

in oiyi oo/: ic\ ° /co m\ 11A 100 
312 314 ilb-52.1 jol-joz 3/1 J/4 joo 

394 401 403 405 411 420 437 453 466- 

467 470 474 478 496 507-509 517 530 

532-533 584 588 593 602-603 608 610 

617-621 630-631 633 639 642-643 661
693 729 746 761 765 769 834 842 848
887 907 923 947-950 957 967 969 

adrenal gland 

Clontech 

ADR002 

1 3 12-13 21 23-24 27-29 67 74 78 103- 
105 108-109 113 115 118 120-121 128- 
133 149 156 160 172 177 182 214 217 
223 232-233 247 254 269-270 273-274 
277 283 285 288 298-299 308 317 319 
328 338 340 342 361-362 364 372 376- 
377 382 384 401-402 405-406 416 420 
431 437 444 446 448 457 462 484 500 
507 517 524 532-533 539 545 554 561- 

<\AO <\6.A <nOQ *\Q1 £fiO £m £fl£ &CY7 

joz jo4 joo jy I ouz-ouj ouo-ou/ odd 
642 646 649 658 664 674 693 703 730 
740 745 752 759 765 767 775 779 799 
809 817-818 839 845 856 859 863 887 
890-891 896 948 953 958 961-963 973

adult heart 

GIBCO 

AHR001 

1 3-4 8 10 14 20-21 25 28-29 33-34 37-38 

41 48 54-57 65 69-72 75 78 80 82-83 97 

99-100 108 112-115 117-121 123-124 

128-133 141 144-146 149 152 159 162- 

163 168 172 176 179 181 184 186-187 

190-191 201 203 208-209 212 216-218

221 223 227 229 233 244 247 249 253- 

255 258 263-264 267 269-270 274 278 

280-282 285 289 291 295 297-299 301 

303-304 308 313 317 321-322 326 328 

334 344 348 352 358 361-363 370-371 

380 382-383 388 394-396 398 401 403 

405-406 410-416 423 425-427 430-431 

436 452-453 464-465 470-474 481-484 

487-488 490 492-494 496 499-500 505- 

506 508-509 514 523 529-530 533 547- 

548 553 558 563-565 577-578 586-588 

590 593 597 601-603 606-608 610-613 

617-619 621-622 626-628 637-638 642- 

644 652 658 661 672 682-683 688 691 

693 697 699 708 711 713 715 732 737

745 747-748 750-753 759 761 765 768- 

770 775 790 802-803 814-815 818-819 
ein QT7 qiq C/in qao c/ic q<q cai 

862 867 876-877 887 891-892 896 900- 
901 903 905-906 908-909 919-920 922 
925 928 936 939-940 946-947 950 953 
959 967 970-971 973 977 

adult kidney 

GIBCO 

AKD001 

1 3 8 12-14 17 19-25 28-29 33-34 37-39 
41 46-48 50 52 55-60 62 65-67 69 71-72 
75 77-78 82 84 89-90 93 97 108-110 114-

11£ 11C 101 107 108 1 in \11 17^ 
110 llo-lZl 1Z3-1ZD IZo lj\J-Ijj IJJ 

138 144 146 149 156 159-161 163-164 
167-172 176 179 184 186-187 189-190 
194 196 200-202 204 209 211-212 216-
217 219 221 223-224 229 232-235 244
247 250 253 255-256 258 263-264 268-
272 274 277-281 283 286 288-290 292 
294-295 297 301 303-309 311-314 316 
319-323 325 328-338 342 348-349 352 
354-355 358 361-363 365 370-371 373 
376-378 380 382-383 388 395-399 401- 
403 405-406 409-413 416 418-420 425- 
428 430-431 440 442 452-454 462 464- 
465 470 472-474 477 479 481 483-485 
487-489 492-495 498-500 504 506 510 
517 522 525 529-530 532-533 539 542- 
543 547 551-552 558 560-564 569-570 
573-574 577-578 580-583 585-590 594- 
596 601-608 610-613 617-621 624 626- 
628 630-631 634-636 639 642-643 648 
652 656 658 664-665 676-677 679 681 
688-691 693 697 699 708 711 715 717 
720-722 724 729-732 738-741 747-748 
751-753 761 765 770-778 780 784 789 
791 793 797 804 813 817 823-824 834 
837 839 842-843 845 848 859 861-862 

864 867 870 876 877 887 880 8Q7 804 
oof- oO/ o l\J o /O-o / / oo / 007 oyz-oy 1 ^ 

896-897 900-901 903 907 913-915 918 
921 923 925 929-930 932 939 942 946- 
947 949-950 953 958-959 961-963 967 
969 972 977 

adult kidney 

Invitrogen 

AKT002 

1 3 16 21 30 32 35 38-41 46-47 56 77 92 
109 123-124 130-131 146 149 161 167- 
168 172 176 190 209 212 234-235 258 
279 292 301 303 308 314 333 355 363 
372 380 383 396 399 402 418-419 426- 
427 431 448 454 461 471-474 488-489
495 498 504 506 508-509 520-521 530
537 539-541 545 547 563 582-583 592 
613 617-618 621 623-624 633 655 688 
690 693 690 704 713 73? 745 759-75'* 
761 766-768 770 784 789 797 837 842 
848-849 866-867 877 887 893-894 903 
914-915 925 929-930 937 944-945 947- 
949 955 961 967 984 

adult lung 

GIBCO 

ALG001 

1 3 14 18 28-29 38 54-56 59 92 110 114-
115 130-131 146 149 156 159 164 167
176 184 209 217 234-236 240 255-256
258 263-264 269 271 276 280-281 297
305 308 312 314 322 325 332 336 344
353 361-362 388 401 410 420-421 426-
427 431 465 469 474 484 498 500 506 
508-509 517 530 532 573 592 596 613 
619-620 623 626-628 638 658 679 681 
684 689 717 731 741 771 791 799 817 
834 845 861-862 864 875-876 901 921 
925 928 932 940 947 949 959 962-963
967

lymph node 

Clontech

ALN001

3 10 110 146 160 168 196 209 221 269

no 1A1 11C 1AQ 1QA /IfK /111 /IOA ylT"> 

I/o Jul 330 J4o Jy4 4Uj 41 1 4/U 42z 
459 464 474 485 503 506-507 532 563 
582 619 623 630-631 642 669 684 697 
713 715 727 747 767 769 789 825 839 
842 849 887 896 913 921 925 

young liver 

GIBCO 

ALV001 

3 14 16 37-38 41 51 56 60 97 104-105 
108 110 117 119 128 130-131 134 139 
149 152 169-172 176 184 189-190 200 
209 212 216 218 228 232 255 258 263 
270-271 275 285-286 292 295 298-299 
301 304 314 341 358 365 368 376 400 
410-412 431 474 481-482 485 496 500 
504-505 517 520-522 524 530 532-533 

G.AH CC1 <TQ1 COO /CI 1 jCOI C~lA 

j4/ jdI OoJ joI joi oIU-oll oZl oz4 
635 643 691 708 711 715 720 752 755 
761 768 796-797 811 818 830 845-847 
852 864-865 867-869 896 899 910-911
949 958 965 969 972-973 

adult liver 

Invitrogen 

ALV002 

3 37 42 56 60 71 82 104-105 114-115 
117-118 125 130-131 134-135 164 169- 
172 176 179 200 203-204 212 217 223 
226 232 237 244 263 274-275 292 301 
310-312 314 317 349 354 364 368 372 
376 398-399 402 426-427 439 442 451 
458 465 474 482 485 490 506 515 525 
527 545 547 552 568 571 573-575 582 

co*7 cam enr £L(\A /CAC £LC\Q £1A /C0 1 £L1f\ 

jisf by4-jyj o04-6Uj> ouo oil) ozl oiu- 

£1 1 &1A K£A £QCl £G1 AOQ 
Oj 1 034-OJ J O d / OJ / Oo4 oy\J Oyj Oyy 

723 726 745 751 763 767 784 793 811
822 845 848 852 856 861-862 864 892 
899 908-909 925 950 958 967 983 

adult liver 

Clontech

ALV003 

60 134 169-171 275 

adult ovary 

Invitrogen 

AOV001 

1 3 9-10 12-14 16 18 20 22-25 28-29 33- 
35 37 39 41-42 46 48-50 55-57 59 63-67 
69 71-72 75 77-80 82 88-89 92 101 103- 
106 108-110 113 115 119-121 123-126 
128-133 135 138 142-146 149 151-152 
159-161 167-168 172 174 176-177 179 
181 184-190 194 198 200 203 208-209 
211-212 214 217 219 221 224 226 232- 
235 240-242 246-247 249 251 254-255 
258-259 264 269-271 274 276-277 279- 
283 285 288 290 293-294 297 301-304 
306-308 311 314 319-322 325-326 328- 
329 331-332 335-338 341-342 344 348 

1S,A ^8 1£1 1^ 1£8 11C\ 111 *X1A 

376 379-380 382-383 388 394-396 398- 
399 401-402 405-406 409-412 416 418- 
421 423 425-433 438 442-443 449-452 
454 462 464 466-467 469-471 474 479
482-484 488 490 492-496 498 500-504
506-509 511 515-518 520-524 529-530 
532-533 537 539-542 545 551 555 558 
560-565 569 571 573 577-578 581-583 
585-590 592-593 596-597 600-605 608 
610-611 613-614 617-628 633-637 639 
642-643 646-648 650 652 654 656 658 
664 668-670 672 674 679 681 684 688 
691 693 697-699 701-702 713 717 721- 
722 724 729-732 738-744 747-750 752- 
753 755 759 761 765 767-774 779-780 
783-784 789 793 795-797 801 813-818 
823-824 828 830-832 834 837 839 841- 
842 845 848-851 856 859 862 864 866- 
867 870-871 874-878 881-883 887-889 
891 893-894 896-897 901 903 906-911 
913 919-922 925 928 930 936 939-940 
943-944 946-947 949-950 952-953 955 
957-958 962-963 965 967 969 971 973 
977 981-982 

adult placenta 

Invitrogen 

APL001 

41 56 67 253 301 304 334 380 383 451 
474 479 500 577-578 643 648 729 767 
856 859 866 873 962-963 

placenta 

Invitrogen 

APL002 

3 21 31 38 63-64 78 135 143 168 186-187 
212 232 244 263 280-281 334 336 344 
348 371 374 394 399 461 490 582 588 
602-607 610 620 699 745 769 793 817 
822 859 897-898 923 928 931 943 949 
969 973 

adult spleen 

GIBCO 

ASP001 

1 3 21-22 46 52 54-55 57-58 61-62 72 74 
78 82 88 118 121 130-131 137 152 159 
168 172 189 203 209 217 223 234-235 
252 255 263 269 271 274 282 288 290 
301 314 322 335 350 363 394 403 405- 
406 410-412 415 431 459 464 472-474 
482 488 500 506 510 514 517 532 537 
542 561-563 589 593 602-603 610 613 
619 621 636 642-643 655 658 662 674 
676 679 681-682 684 689 691-692 697 
699 715 720 723 729 747-748 769-770 
782 793 818 830 834 845 856 859 862 
877 887 893-894 896 903 906-907 914- 
915 918 925 928 930 940 946 965 967 
977 982 

testis 

GIBCO 

ATS001 

6 22 28-29 33-34 41 48 52 62 65 72 97 
106 109 118 132-133 145-146 168 172 
176 183 185 189-191 195 209 211-212 
214 221 223 230 254-255 258 263 269
283 297 312 314 321 342 352 361-362 
365 380 383 388 395 401 405-406 412 
430-431 441 469-470 474 479 495-496 
500 506 520-521 533 543 545 548 560
563 574 582 589-590 593 608 616-618

711 745 747-748 765 767-768 779 784
789 812-813 834 837 839 848 859 862 
868-869 875-877 887 889 893-894 896 
928 944 947 953-955 972 981 

Genomic DNA 
from BAC 
63118 

Research 
Genetics 
(CITB BAC
Library)

BAC001 

515 

Genomic DNA 
from BAC 
39316 

Research 
Genetics 
(CITB BAC
Library)

BAC002 

640 

Genomic DNA 
from BAC 
39316 

Research 
Genetics 
(CITB BAC 
Library) 

BAC003 

640 

adult bladder 

Invitrogen 

BLD001 

50 55 66 71 111 143-144 148 160 201 209

9?^ ?S6 ?R6 10S ^10 

AjLD ZJJ-Z,JO ZOU ZOI ZOO D\JJ Jl J J17 

340 394 431 442 488 497 505 518 552 
588-589 621 636 664 676 715 738-739 
769 790 824 837 845 877 887 936 940 
948 962-963 967 

bone marrow 

Clontech 

BMD001 

3 10-13 16 18 20-21 25 28-29 31-34 41 45 
48 52 54-55 57 59 61 65 67 72-73 75 78 
80 82 84 99 103 108 110 114-115 118- 
120 123-124 128 130-133 143-144 148 
152 159-161 163 168 172 174 176 178 
190 192 198 203 209 211 217-218 221 
223-224 227 233-236 244 247 249 252 
254 258 260-262 267 269 272 278 280- 
281 284-285 288 290 294-297 301 304 
308 314 317-318 320-321 325 328-330 
333-335 349 351-354 358 363 365 367 
377 382 388 394-397 400 405 408 410- 
412 418-421 425-428 431 433 435 442 
449-450 453 455 459 464 468-470 474 
478-479 481 484 490 496 504 506 508- 
509 511 519-521 530 532 539 553 558- 
559 561-563 580 582 586 592 599 608 
610 613-614 617-619 623 625-628 635 
638 641-643 658 664 672 682 699 711
713 717 731 734 740 742-743 745 761 
768-771 774 776-778 784 787 789 813 

R1 7 R1R R7? R^4 R3Q Rdft R4? R4R Rfi? 

866 870 876 885-887 891 896-898 900 
903 906 913 919 921-922 927-928 939 
944 947 950 953 959 961-963 967-968 
970 973 977 

bone marrow 

Clontech 

BMD002 

3 9-10 15-19 30 33-34 39 45 54 57 63-64 
71 82 102 116 119 130-133 148 152 156
159-160 168 176 182 224 254-255 271-
272 282 285 290 297-299 301 305 323 
333 340 344 351-355 358 361-362 364 
367 370 372 387 394-395 399 403 405 
409 411 449-450 459 461 468 474 488- 
489 524 530 532 580-582 592 602-603 
611 617-618 621-622 630-632 642 661 
663 694 717 730 734 740 74S 7S9 7S 1 ? 
761 767 769-771 775-778 784 787 811
813 818 832 840 842 849 859 878 887 
893-894 896-898 903 906 908-909 923 
928 944 946-949 953 958-963 965 982 

bone marrow 

Clontech 

BMD004 

54 

bone marrow 

Clontech 

BMD007 

766 887 928 

adult colon 

Invitrogen

CLN001

99 37 67 97 1 17 191 148-140 168 17? 1Qfl 
200 204-205 232 244 263 268 292 301- 
302 363 377 384 452 455 459 470 530 
582 602-603 619 687 793 798 7S1 761 
831 861 887 914-916 934 955 969 984 

Mixture of 16 
tissues — 
mRNAs* 

Various 
Vendors* 

CTL016 

358 740 760 

Mixture of 16 
tissues - 
mRNAs* 

Various 
Vendors* 

CTL021 

468 527 928 

adult cervix 

BioChain 

CVX001 

1 3 10 14 22 28-30 37 41 47-48 51-52 54- 
57 71 82 89-90 92 106 108 110-111 117-
118 121 129-131 135 141 143-146 160-
161 164 168 172 177 189-190 193 195
200 204 209 211-212 217 226 229-230
232 234-235 240-242 246 254 260-263
268-270 274 277 282 285 292 295 297
305-308 314-316 319 328 343-344 348
354 358 363 368 380 382-384 389 394
396 399 401 405-407 410 416 418-421
428 430-431 437 442 453-454 459 464 
469 471-473 476 480 484 492-495 500 
504 506-509 516-517 526 530 532 545 
550-551 563-565 569 577-578 585-586
590 608 611 613 619 621 623 628 630- 
631 634-637 641 643 648 656-658 664- 
665 674 679 682 689-690 693 700 703 
708 713 721-722 724 728 732 742-743 
747 750 752 755 757 761 763 767-769 


* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain mRNA (Invitrogen), 2)
normal adult kidney mRNA (Invitrogen), 3) normal adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA
(Invitrogen), 5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal
skin mRNA (Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA (Clontech),
10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human lymph
node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15)
human esophagus mRNA (BioChain), 16) human conceptional umbilical cord mRNA (BioChain).

779-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871 
874 877 887 891-894 897-898 901 913 
916 919 921-922 925 946-947 953 958- 
959 967 969 973 

diaphragm 

BioChain 

DIA002 

3 39 184 203 431 563 848 967 

endothelial 
cells 

Stratagene

EDT001 

3 6 8-10 14 19-24 28-29 33-34 37 39 41 
46 48 52 55-58 62-65 67 69 71-72 75 78 
80 82-83 87 101-102 108-109 114-115
117 123-124 128 130-133 135 138 143 
145-146 149 156 159-160 167-168 172 
174 176-177 179 181 184-187 189-190 
194-195 200 203 208-209 212 216-217 
219 223-224 226-227 229 234-235 244 
248-249 254-256 258 263-264 267 269 
271 274 276-282 285 290-291 294 297 
301-304 308 311 313-314 316-317 320-
321 323 325-326 328-329 331-332 334- 
337 339-341 344 348-349 352 354-355 
358 361-363 365 367 371-372 375 379- 
380 383 389 394-395 398-403 405-406 
409-412 425-428 437 442-443 448 454 
464 466-467 474 479 481 490 492-498 
500 503 506-509 511 517 520-521 523- 
524 530 532 537 540-542 558 561-563 
565 569-570 573 581-583 586 588-589 
596 602-608 610-611 613 617-622 625 
628 630-631 633-637 642-643 646 648 
650 652 659 661-662 682 688 690-693 
696 698-699 708 712 715 717 720-722 
724 727 729 740 745 748-750 752 761 
765 767-770 772-773 779 784 789 792- 
794 796 802-803 811 817-818 821 824 
827-828 830 834-835 837 842 845 848 
859 861-862 864 866-867 870 876 885 
887 891 893-894 897-898 900 903 906- 
907 913 916 921 925 939 947 950 953 
955 957-958 962-963 967 973 978 984 

Genomic 
clones from the 
short arm of 
chromosome 8 

Genomic 
DNA from 
Genetic 
Research 

EPM001 

324 515 640 

esophagus 

BioChain 

ESO002 

97 103 128 371 474 

fetal brain 

Clontech 

FBR001 

67 129 156 159 232 267 433 446 503 845 
952 

fetal brain 

Clontech 

FBR004 

28-29 185 213 277 350 384 432 485 501 
549 651 747 754 761 780 787 848 870 
887 906 958 

fetal brain 

Clontech 

FBR006 

10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128 
130-131 138 142 148 152 159-160 179 
185 188 194 197 203 210 212 214 219
222 227-229 243-246 249 252 256 264
270 273 282 285 290-291 293 301-303 
305-306 312 321-322 325 327 339-340 
344 346 350 354-357 363 367-371 374 
388 391 394-395 399 402 405-406 410 
414 420 426-427 436-437 442 444 454 
456-457 460 462 464 470 480 485 492- 
494 507 510 516 524 528 530-532 539- 
542 549 553-554 561-562 580-582 588- 
589 602-608 611 615 617-619 621-622
624 632 636 641-642 646-647 651-653 
661-662 666-669 672 677 691 715-716 
730 735 740 752 754 761 767-770 772- 
775 780-781 799-801 808 818 822-823 
835 843 845 856 859 864 867 876 880 
885 887 890 893-894 896 913 918 926 
942 946-947 951 957-959 962-963 970- 
971 

fetal brain 

Clontech 

FBRs03 

130-131 312 517 637 691 738-739 

fetal brain 

Invitrogen 

FBT002 

3 22 28-31 47 57 63-64 72 75 77-78 86 
94-95 97-98 126-127 135 140 143 156 
159-160 167-168 177 185 190 196 201 
203-204 214 217 230 254-255 258 267 
273-274 277 279 282-283 292 301-302 
305 312 314 323 329 346 348 367 374 
382 394 399 401 403 412 415 420 432 
437 474 482 485 495 507 513 517 527 
529-530 539-542 548 552 579 587-588 
600 604-605 612 617-618 621-622 624 
634 642-643 647-648 650 679 689 693 
699 712 715 742-743 745 748-749 753 
768-769 793 797 829-831 834 845 848 
856 859 893-894 908-909 913 916 931 
933 940 950 967 969 

fetal heart 

Invitrogen 

FHR001

19 57 130-131 394 431 642 769 844 

fetal kidney 

Clontech 

FKD001 

3 31 33-34 38 48 54 72 160 208-209 211
223 264 269 277 283 290 313 325 341 
348 358 396 418-420 474 484 506 508- 
509 517 520-521 532 547 553 558 567 
569 587 596 608 610 613 619 622 626- 
627 642 679 734 745 818 843 887 896 
903 916 969 971 

fetal kidney 

Clontech 

FKD002 

19 474 726 903 

fetal kidney 

Invitrogen 

FKD007 

3 118 186-187 230 244 271 432 887 969

fetal lung 

Clontech 

FLG001 

69 132-133 156 168 208-209 217 267 269
274-275 286 354 394 396 406 462 483- 
484 608 619 751 769 771 834 914-915 
925 

fetal lung 

Invitrogen 

FLG003 

3 8 28-29 32 39 50 66 82 88 92 168 186-
187 200 204 212 226 229 246 274 309
327 332 368 374 382 394 398 426-427
431-432 442 485 536 555-557 587 604-
605 621 624 636 642-643 661 677-678
724 753 769 848 859 864 877-878 896 
902 904 914-915 958 

fetal lung 

Clontech 

FLG004 

130-131 394 664 769 942 

fetal liver- 
spleen 

Columbia 
University 

FLS001 

3 8-10 12-13 16-17 19-25 27-29 33-35 37- 
38 41 45-46 48 52 55-58 60-67 69 71-74 
77-78 80 82 84 87-90 104-106 108-109 
112-121 123-125 128-134 138 141 143- 
146 149 151 156 159 163-164 167-172 
174 176-179 181 184 186-188 190 194 
200-201 203 208-209 211-212 216-217
219 224-227 229-230 232 234-235 237 
241 243-244 246-248 254-255 258 260- 
263 267 269-270 273-282 284-285 288- 
290 292-295 297-299 301-306 308 31*1- 
318 320-323 326 328 332 335 341-344 
348 352 354-359 361-365 367-368 371- 
374 376-380 382-383 388-389 394-396
398-399 401-411 413-414 416 418-421 
425 428-430 432-433 437 439 442-444 
449-450 452 456-457 461-470 472-474 
478-479 481-482 484-485 487 490-494 
497-499 504-507 511 514-515 517-521
523-524 526 529 532 537 540-541 547 
555 558-559 563 575 577-578 580-596 
598-599 601-603 606-608 610-613 617- 
624 626-628 630-631 634-636 639 642- 
643 647-648 654-656 663-665 672 674- 
675 679 681 684 686 688 691 693-699 
711 713 715 717 719-726 729 732-733 
738-740 745 748-749 751-753 757 759 
761 767-770 776-778 780 784 787 792- 
794 799 804 809 811 813 817-819 822-
825 830-831 834 837 840 842 845-848 
852 856 859 861-862 865 867-869 871 
874-878 887-888 891 893-894 896-900
903 905-911 913 916 918 923 928 930- 
931 936 939 942 944 946-950 952 958- 
959 961-963 965 967 969-970 972-973 
976-977 981-983 

fetal liver- 
spleen 

Columbia 
University 

FLS002 

3 8-13 15-17 19-20 22 25 28-29 33-35 37
41 45-46 52 54-56 60-61 63-64 66-70 73-
74 78 80 82 92 99 104-106 108-109 112
115-116 118 120-121 123-125 128 132- 
135 139 141 143-144 146 149 152 156 
159-161 167 169-172 174 176-177 179 
181 185 188 190 194 196-197 200 204 
212 214 216-218 223-224 226-230 232-
235 237 246-247 252 254-255 258-263 
267 270-277 284-286 288 292 294-295 
297-299 301 303-305 308 310 314 318 
320 323 328 330-332 335-337 340 342-
344 352 354-355 358 361-365 367-368
371 373-374 376-377 382 388 394-396
398-399 401 405-406 409-411 413 418-
421 429 431 439-440 442-444 451-452
457 462-463 466-468 470 474 477-479
481 483-484 487-488 491 495 499 504 
508-509 516 519-521 524 526-528 530 
532 537 540-541 543 545-547 550-551 
553 555 560 564 568 574-575 577-578 
580-592 596-597 600 602-603 608 610- 
611 613-614 617-618 621-622 628 630- 
631 634 637 639 642 644 647 654 658- 
659 665-667 669-675 679 681 684-685 
688-690 693 695 697 708 711 713 715
717-719 723-727 729 731-734 738-739 
741 745-746 749-750 753 759 761 766- 
767 769-770 776-779 782 784 791-792 
794 805 808 817-818 822 824-825 830 
834 837 842 845-849 852 856 859 864- 

0/C£ QCH QHA OTO OOO OH1 OflO OC\C Clf\C\ 

ooj 00/ 0/4-0/0 OOO oyi-oyz oyo-you 
903 905-906 908-909 913 916 918 921 
923 925 932 936 939-940 942 944 946- 

QAH Q/IO O^A QC< 0<rf£ OCQ o<o 0£1 

V4/ y'ty-yDV yji yDD-yjo yja-yjy yol- 
963 965 968-970 973 977-978 981 

fetal liver- 
spleen 

Columbia 
University 

FLS003 

19 60 78 224 273 275 370 373-374 401 
602-603 639 643 730 732 738-739 748 
752 770 782 928 930 947 949 

fetal liver 

Invitrogen 

FLV001 

37 55 60 69 72-73 97 104-105 108 113- 
114 116-118 121 135 143 152 167-168 
186-187 195 200-201 209 217 223 240 
244 253 255 275 284 301 311 314 317
336 342 348-349 358 371 374 382 394
402 411-412 418-419 428 430 442 453
517 568-569 580 582 584 587 589 601- 
603 606-608 617-618 624 634 639 642- 

AAA &A& ££A £7Q 71 < 717 77H 
044 040 004-00 J OOy O/y 1 1 j 1 1 / izX) 

70£ 1A <\ 7zi£ 7^1 7£Q 770 787 701 7Q/L 
/ZO /43 /45 /Jl /Oy-//W /oZ fyl fy^t 

797 824 830-831 845-847 852 859 870 
oyy yij-7iu yz.j yz,o yno yjv yjo y\jy 
976 982 

fetal liver 

Clontech 

FLV002 

72 418-419 632 

fetal liver 

Clontech 

FLV004 

3 160 169-171 355 367 374 376 547 617- 
618 621 646 717 741 771 836 878 976 

fetal muscle 

Invitrogen 

FMS001 

15 27 32 37 67 72 83 99 112 121 138 167 
174 177 186-187 190 203-204 211 215 
230 252 259 312 374 403 406 409 457 
461 485 505 517 528 530 540-541 544 

SAQ 'i'iA <\SR <\7Q ^Rft fifl? fifH fiflR 
DHy JjH J jo J/7-JoU JOJ OUZ-OOj OUO 

639 642-643 654 664 699 715 730 737 
751 772-773 788 802-803 810 848 856 
859 864 868-869 887 893-894 905-906 
910-911 923 948 967

fetal muscle

Invitrogen 

FMS002 

15 99 130-131 223 361-362 431 474 505 
581 639 643 666-667 784 790 808 810- 
811 874 880 887 903 946 950 958 962-
963 973 

fetal skin 

Invitrogen 

FSK001 

3 6 20-22 32-34 41-45 47 49-52 55 63-64 
66 69 77 80 88 91 98 101 111-112 115
126 130-131 135 142 144 146 160 163 
167 176 188-190 196 201 204 208 213 
215 217-218 229 232 244 246 248 255 
263 265-269 274 279-281 283 285 288 
292 294 297 301 303 308 314 321 341- 
342 344 348 354-355 358 361-362 366 
369 371-372 374 381-382 384 386 394 
401 403 405 413 415 428 431 437 440 
460 466-467 472-473 477 481 483 495 
499 504 517 522 532 536-537 539-541 
545 556-558 569 574 576-578 580 584- 
585 587-589 592-593 602-603 606-608 
612 617-618 621 624 634 637 639 642- 
643 647 664 673-674 676 680-681 689 
699 705-707 709-715 724 728-730 738- 

H Af\ 1A C HAQ H^') l/ZZ HA.Q T/Cd T70 T7-2 

/4U /4j /4o IjI iod /o&-/oy IIZ-1 15 
793 797 817 823 830 834 842 848 859 
861 864 870 874 883 887-888 893-894 
901 904 908-909 913-916 923 925 947 

Q^H Q^Q O^O Q£/l Q&H 0^7^ 

y jU y jo yoZ-yo't yo/ y / j 

fetal skin

Invitrogen

FSK002

1 1 1(\ 1 3 1 1 AC* 1 QA 1fi£ 1*\A A(\C\ ACi< 
j I jU" 1 j 1 HO IVH JUO Jj'f JO/ *tUU *fUJ 

474 489 520-521 547 558 561-562 585 
596 730 740 748 755 767 771 810 840 
893-894 946 959 

fetal spleen 

BioChain 

FSP001 

276 563 842 

umbilical cord 

BioChain 

FUC001 

3 20 33-34 39 48 50 52 55-57 65 67 69 72 
77 79 82 92 109 112-113 121 132-133 
138-143 156 167-168 172 174 179 184- 
185 190 194-196 200 202-203 208-209 
229-230 244 269-271 278 284-285 290 
297-299 303 305 308 320 331-332 336 
338 342-343 363 367 372 374 379-380 
383-384 392-394 397 399 402 405-406 
410 425-427 429-430 449-450 474 476 
484 497 499 501 504-505 510 515 517 
532-533 539 549 551 558 563 569 574 
577-578 581 586-587 597 602-603 608 
610 617-619 621 626-627 634-637 639 
642-643 658 663-664 674 690-691 693- 
694 699 713 715-717 720 724 726 729 
738-739 746-747 749 759 761 765 768- 

7fiQ 774.77'? 7Q3 707 Qfi7 CI 8 899 8^7 
toy / / 4— / /yj iy / ou/ 010 oaz od i 

848-849 856 862 868-869 874 885 887 

892-894 903 906-907 916-917 919-920 

928 936 939 944 946-947 962-963 967 

969

fetal brain

GIBCO 

HFB001 

3 9-10 12-14 16 21 25 28-30 32-34 37-39 
41 47-48 52-53 56 65 67 69 71-72 75 80 
84 92 97 103 106 110 114 117-119 123- 
124 127 129 132-133 135 138 141-142 
144-146 148-149 152 156 159-160 168 
172 174 176 179 181 184-185 190 198 
208-209 212 214 219 221 223-224 229- 
230 233-236 240 244 247 251 253-255 
258-259 270 273 276-277 285 297 304- 
305 308 312 314 322-323 325 328 332- 
333 335-337 339-340 342-344 346 352 
354 358 363 365 370-372 374 382 394- 
396 398 401 403 405-406 409-412 414 
416 425-427 431-432 437 442 445 453 
456 462 466-467 469-470 472-474 479 
483 488 490 492-497 500-501 504 506- 
510 520-521 524 530 537 539 545 549 
552 558 560-562 564 569 579 582-583 
586-587 596 602-608 610-612 614 617- 
624 626-628 630-631 633 635 638 641 
643 647-648 656 658 661 676 679 688- 
689 693 696-697 711-712 715 724 726
731 735 745 747-749 752 754 761 765 
767-770 774 779-781 784-786 789 799- 
800 802-803 813 818-819 823-824 831 
834-835 837 839 845 848 859 864 866- 
867 871 874-875 881 887 891 893-894 
896-897 900 906-907 910-911 918 921-
922 925 927-928 930 943-944 946-947 
950 953 962-963 965 969 972-973 977 

macrophage 

Invitrogen 

HMP001 

86 168 186-187 297 537 608 681 761 845 
877 

infant brain 

Columbia 
University 

IB2002 

2-3 9-10 12-14 16 21 25 27-30 32 37-38 
46-47 49 55-56 58 65 69 71-72 78-79 82 
84-86 91-92 98-99 106 109-110 113-115 
118 127-128 130-133 135 138 142 144 
151 156 168 173-176 180-181 185-188 
192 194 196-201 203 208 210-212 214 
217-218 224 229-231 233 236 238 240- 
241 244 246 251-256 259 263 270-271 
277-279 284-285 287 293-294 296 301- 
302 308 312-314 317 322-323 327 330 
333 339 342 345-346 351 354 358 361- 
362 365-366 368 370-371 373-374 382 
388 394-396 402 405-406 411-412 415- 
416 420 424-425 428 431 436-437 440- 
441 444-445 453 456 460 465 474 479 

/ICO /ICl /100 AQ< /1Q£ AQQ 

506-510 515-517 520-521 524-525 529 
531-532 534-535 537 539-542 544-545 
549 561-562 569 574 577-578 580-583 
586-587 589 592 596 600-608 610 612- 
613 616-618 620 622 624 629-632 634-
635 637 641 643-644 650-651 653 661 
663-664 676-677 689 693 695-698 708 
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785- 
786 789 791 796 798 800-803 807 811- 
813 818-819 822-824 830-831 834-835 
837 839 842-843 845 854 856 858 864 
867-869 875-877 879 881 887 892-894 
896 903 907-911 913 916 919-920 925
930-932 936 939 943 946-947 953 958 
970-973 977-978 982 984 

infant brain 

Columbia 
University 

IB2003 

3 12-13 21 27-29 32 39 49 69 72 82 91 
113 116 126 128 132-133 142 144 156 
176-177 184-185 188 194 208 212 223-
224 228 230 244 255 259 267 270 273 
276 293-294 312 320 326-327 337 342 
346 354-355 358 361-363 382 388 390 
394 396 399 402 420 425 431 442 462 
474 482 484 488 495-496 510 520-522 
524 529 540-541 549 563 582 586 588- 
589 596 600-603 606-607 612 617-618 
620-621 632 647 650 679 720-722 724 
735-736 746 751 754 769 785-786 793 
800 807 811-813 818-819 822 824 831
834 838-840 843 856 864 892 896 907 
919-920 925 930-931 936 947 950 957 
973 982 

infant brain 

Columbia 
University 

IBM002 

16 47 82 84 201 263 302 376 394 421 440 
488 537 592 606-607 635 740 769 887 
892 906 921 926 971 

infant brain 

Columbia 
University 

IBS001 

84 86 180 185 198 201 203 230 279 312 
326 346 354 366 388 488 542 581 588 
620 647 664 732 740 785-786 801 807 
822 827 910-911 925 931 

lung, fibroblast 

Stratagene

LFB001 

3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267 
269 274 277 282 284 303 308 312 320 
334 336 352 372 396 398 412 414 437 
453 464 470 481 492-494 508-509 532 
539 581 584 617-619 621 628 633 643 
688 691 745 752 761 768 794 822 837 
848 876 887 953 967 973 

lung tumor 

Invitrogen 

LGT002 

1 3 9-10 12-13 20 31 38 41 46 48 51-52
56 58 63-64 72 74-75 78 82 88 101 106- 
107 110 114-115 117-118 120-121 123- 
124 128-133 135 143-146 149 151 156 
159-161 163-164 167-168 172 176 178- 
179 184-185 189-191 194-196 200 203
209 212 216-217 226 228-229 232 234- 
236 241 246 248 256 258-259 263-264 
269-271 274 282-283 285-286 290 292 
294 297 301 308-309 311 314 317 321
326 328-329 331 333-334 341 348 352 
354-355 363 365 371 380 382-383 388 
394-395 398-402 405-406 410-411 413
416 418-419 426-427 439 442 452-453 
458-459 461-462 464-465 470-471 474 
478 483-484 490 495-496 499 510 522 
524 528 536-537 540-541 543 548 556- 
558 560-565 571-573 580 582 587-588 
592 597 602-605 608 610 612-613 617- 
622 625-629 633-634 636 642-644 648 
661 664 669 679 688-689 691 693 699- 
700 708 717 723-724 730 733-734 738- 
740 745 747 749 752-753 761 767-768 
770 779 782 784-786 789 793-794 797 
817-818 820 823-824 834 837 842 845 
848 855 857 859 862 864 866 870 875- 
877 887 892 896 900-901 907-909 914- 
915 919-920 923-925 939 943 947 949 
953 958 962-963 965 968 970 972-973 
977 

lymphocytes 

ATCC 

LPC001 

3 9-11 32 47 50 56 71 75 88 97 99 102
121 125 128-129 135 138 141 149 163 
167-168 212-213 217 233 255 290 294 
301 305 311 314 342 372 377 388 398-
399 410 437 442 453 470 474 481 495 
500 506 510 529 532 537 542 558 571 
579 604-605 610 620 628 637 643 658 
666-667 676 679 697 708 713 728 730 
734 749 765 768 796 807 818 822 834 
839 848 859 875 885 887 896 903 906 
914-915 928 947 973 981-982 

leukocyte 

GIBCO 

LUC001 

1 3 9 11 18-19 21 23-25 27 31-34 39 41-
42 46-48 52 54-58 62-69 71-72 74-75 78- 
80 82 89-90 93 99 110 115-121 123-124 
128-133 135 138 141 143-146 149 152 
156 159-161 163 167-168 176 179 181 
186-187 189-190 194 198 200 203-204 
209 211-212 218-219 226 232-236 240
244 247 251 253-255 258-259 263-264 
269 271 274 278-279 282-283 285 288- 
290 294-295 297 301-306 311 313-314 
317 320-321 325 328 330-331 335 337 
342 344 348 350-351 353-354 358-359 
361-365 368 371-372 375 388-389 394- 
395 397-401 403 405 407 409-412 421 
425-427 432 437 442 448-450 452 457 
460-461 468-471 474 476 479-482 484 
492-494 496-498 500 506-510 516-517 
520-521 524 529-530 532 537 540-544 
551 553-554 558 560-565 569 577-578 
580-583 586-587 589 592 596-597 602- 
603 606-608 610-624 626-628 630-631
634-635 641-643 654 657-658 661 663- 
665 669 672 677 679 684-689 691 696- 
697 699 708 711 713 715 717 721-724 
728 730 738-740 747-749 755 761 765 
767-769 771 774-779 782 784 789 791- 
792 794-795 797 807-808 811-815 817- 
818 822 824 828 830 832 834 839-840 
842 845 848 856 859 862 864 867 871 

87^ 877 887 8Q1 807 QQA 8Q£ 8G8 om 
o/j-o// 00/ oyi oyj-oy* 0:70-0:70 y\)j 

906-911 913-916 921 923 925 927-928 
930 932 935-936 939 943-944 947 949- 
950 953 958-959 961-963 965 967 972- 
973 982 

leukocyte 

Clontech 

LUC003 

1 41 82 106 119 123-124 160 177 184 201 
ZlZ zZl ZZo Z/l Z/y Zo5 zyj izl J25 
372 394 411-412 443 468-470 530 532
537 551 569 580-581 613 619 623 626- 
627 642 655 697 761 767 769 775 789 
809 867 887 923 928 950 

melanoma 
from cell line 
ATCC #CRL 
1424 

Clontech 

MEL004 

3 25 55-56 67 71 78 109 121 129 146 167 
172-173 176 200 209 212 258-259 263 
278 297 301 306 312 335 338 340 352 
361-362 367 388 395 402 410 418-419 
429 437 454 464-465 481 496 500 503 

cAn co /i cio rin c /^n czro coi coo coi 

507 524 532 539 560-562 581-582 587
589 599 612-613 617-621 623 643 657 
663-664 672 715 724 748 752 761 767-
768 770 785-786 789 835 848 877 887 
896 916 919-920 947 967 978-980 

mammary 
gland 

Invitrogen 

MMG001 

1 14 19 21 28-29 31-37 47 49-51 55 57 
63-67 69 71-72 75-78 92 108-109 111 116 
121 123-124 126 128 130-133 135 143- 
144 148-150 156 159 164 168 172 177- 
179 184 186-187 190 194 200-204 209 
212 217 226 230 232-236 241 244 246- 
247 252 255 258-259 263 268 270 275 
279-283 285 290 292-293 301 304-305 
311 313-314 317 320 322-323 326-327 
330 332 338 342-344 348-349 354 360 
363 367 371 374 380 382-383 385 388 
394-395 398 401-403 407 409 411-412
418-420 426-427 430 435 437 442 449- 
453 459 461 465-468 470 474 477-478 
480 483 485 488 498 500 503-504 507 
515 519 522 524 529-532 538-541 544 
547 555 560 563 565 569 573-574 579- 

SRft SR? SRA S87 S80 *»Q^ 601 £10 
jov/ joi Jot JO/-J07 J7J jjI pui-Oiu 

612-613 615-618 620-622 624 634 636- 
637 639 642-644 646-647 650 657 663-
664 674 676 679 688-689 691 693 696 
701-703 713 715 717 728 730 732 738- 
739 741-743 745 749 751 753 763 767
769 772-773 785-786 793 796-797 812 

qti OO/I Q^fl Qln QAQ OC/C OCh oz"1 

ftzl-6z4 ojU-ojJ 83/ o4a ojo 859 861 
864 868-870 876-877 887 891 893-894 
898 903-904 907-911 913-918 921 923
925-926 930-931 936 942 949-950 958 
961 966-967 969 972-973 

induced neuron 
cells 

Stratagene

NTD001 

9 65 82 92 106 113 142 146 156 172 176 

1Q1 OflQ 001 TCC T77 IOC 111 1AC Q/CI 

iyi ZU8 zzl zj8 2.1 1 3Z8 333 34o 361- 
362 371-372 375 388 410 414 418-419 
440 471 484 495 516 524 529-530 592

£1fi £9Q AAO 1A^ 7/1C 7^0 "7£1 7Qa 

oiu ozo O'fz oou /*o /ho / jz /oi /yj 
818 848 851 897 

retinoic acid
induced neuron 
cells 

Stratagene

NTR001 

19 87 184 305 385 440 474 626-627 643 
748 799 834 977 

neuronal cells 

Stratagene

NTU001 

19 33-34 42 70 82 87 109 115 126 146 
172 185 188 194 212 255 269 274 283 
312 317 329 340 361-362 367 379 394
399 401 410 420 426-427 474 479 507 
530 579 582-583 610 617-618 636 643 
ojo in /4u /to /t>y /i>4 /yi /y3 /yy 
ouz-ouj oio o4z oji oo4 sy/ yu/ y3z 

pituitary gland 

Clontech 

PIT004 

3 19 123-124 194 255 354 358 373-374 
377 426-427 462 492-494 635 785-786 
793 893-894 

placenta 

Clontech 

PLA003 

138 176 574 896 972 

prostate 

Clontech 

PRT001 

3 9 16 57 65 75 83 108 130-134 138 141 
146 149-150 159 182 186-187 190 203 
209 234-235 276 283 322 413 415 442 
449-450 453 480 484 490 499-500 503 
505-506 523 537 543 5o4 583 602-603
611 619 623 643 650 697 711 729 761 
765 770 776-778 784 789 819 822 831 
839 862 866 887 904 907 921 935 962- 
963 967 973 

rectum 

Invitrogen 

REC001 

19 30 33-34 66 108-109 123-124 126 129- 
131 143 149 151 156 164 190 201 240 
247 250 263 268 274 279 287 295 298- 
299 310 314 332 341 354 384 394 401

A1f\ /tO< /l/IO AAA A<Q AQ1 AQ< znf\ <C>1 
4zU 4zj 44z 440 4jy 483 48j jZO-dZi 

532 545 559 580-581 584 592 602-607 
610 612 615 619 634 637 646 655 664 
683-684 741 769 793 822 870 908-911
914-916 934 937-938 942 967 973 982 

salivary gland 

Clontech 

SAL001 

16 68 74 84 121 123-124 156 172 190 203 
209 232 248 254 269 292 294 363 377 

1QQ AC\f\ ACil Af\*\-Aft& ztlH Aid AAO 

459 462 474 483 485 563-564 579 587- 
588 599 602-603 643 658 699 728 730 
737 741 748 794 822 867 876 897 903 
981
salivary gland

Clontech 

SALs03 

217 254 270 388 610 

skin fibroblast 

ATCC 

SFB001 

517 949 

skin fibroblast 

ATCC 

SFB002 

269 688 

skin fibroblast 

ATCC 

SFB003 

3 203 897 907 

small intestine 

Clontech 

SIN001 

3-4 47 57 68-69 92 99 125-126 130-131 
135 149 151-152 156 159 185 204 241 
246 291-292 318-319 338 343 348 363 
373 375 382 388-389 392-394 397 400 
437 466-467 471 484 500 517 520-521 
525 547 560 580-581 588 599 602-603 
612 624 643 711 731 733-734 757 761 
769 774-775 794 824 864 904 906 910- 
911 913 948 953 959 976 984

skeletal muscle 

Clontech 

SKM001

15 75 135 146 172 190 218 267 282 308 
410 426-427 474 505 588 620 623 658 
692 713 737 779 790 862 874 878 887 
952 962-963 

skeletal muscle 

Clontech 

SKMs04 

215 

spinal cord 

Clontech 

SPC001

14 20-21 25 28-29 31 39 46 48 59 78 83-
84 91-92 103 112-113 135 160 168 172
176 188 190 205 209 229 232 258 285 
301 308 312-314 321 323 329 346 374 
377 380 383 388 394 398 406 409-410 
431 449-450 453 455 466-467 470-471 
484-486 488 495 497 500 503 508-509 
524 537 539 558 581 586 604-605 611
619 623 630-631 633 656 663 711 715 
729 736 740-741 761 767 769 776-778 
780 818 822 831 835-836 840 843 859 
861 871 875 887-888 897 906-907 913 
919-920 928 931 953 958 

adult spleen 

Clontech 

SPLc01

3 6 12-13 66 130-131 178 365 403 431 
461 558 610 715 797 809 876 947 967 

stomach 

Clontech 

STO001

35 114 130-131 144 155 176 189 206-207 
249 260-262 336 382 398 425 431 453 
461 483 496 500 527 530 580 642 657 
663 669 748 765 768 802-803 839 891 
942 981 

thalamus 

Clontech 

THA002 

30-32 48 66 109 127 130-131 135 142 
145 156-158 168 172 174 185 199 224- 
225 233 246 277 282 286 293 322 332 
334 346 374 384 400 402 420 424 435- 
437 446 466-467 485 503 506 527 542 
549 572 612 615 622 624 633 643-644 
658 676 736 790 794 824 831 835 896 
907 950 969 

thymus 

Clontech

THM001

10 16 20 28-29 32 37 41 52 57 66-67 74- 
75 110 118 121 129-131 141 151 159-160
208 211 218 247 269 289 295 297 320
325 354 358 365 367 372 378 388-389 
395 398 411-412 420 423 435 452 500
508-509 517 524 532 537 551 558 560 
569 577-578 582 586 598 608 611 622
643 684 715 721-723 728 740 766 772- 
773 795 834 837 849 864 885 900 921 
946 948 958 962-963 965 972-973 982 

thymus 

Clontech 

THMc02 

1 3 9-11 16 21 27 32-34 38-39 51 55-57
66 72 74 77-78 80 82 89-90 101 112 115 
118-119 121 123-124 126 138 144 152 
159 168 174 176 178 186-188 197 200 
208 212-214 217 225 233 243-244 246 
254 256-262 279 282 285 288-289 296- 
297 313-314 322 334 343 354-355 358- 
359 363-364 367-368 372-373 382 387- 
389 395 400 402 411 414 426-427 437
440 442 449-450 454 457 462 464 469 
474 479 481 485 490-491 506 508-509 
511 517 522 526 528 532 542 551 554
561-562 564 566-570 580-582 585 589 
597 599-600 602-608 611 613-614 619-
621 625 628 630-631 644 646 655 669 
672 677 684 686-693 697 713 717 720

109. lACi l&fs 7AQ lfS\ 7A7 771 77<\ 
/Zo /H-U /^tO /OU-/OZ /O/ //I I ID 

794 797 804 808 811 816 818-819 837 
840 859 880 883 887-888 896-897 903 
908-911 913 916 924 936 947-948 950 
962-963 965 967 970 

thyroid gland 

Clontech 

THR001 

3 8-9 14-15 19-22 28-29 39 41 55-56 66 
69 71-72 78-79 97 104-105 109 113 115 
119 121 123-124 130-133 135 138 143- 
144 146 148 151-152 156 159-163 165 
168 172 174 177 183-184 196 199-200 
203 209 211 215-218 228-229 232-236
244 254-255 258 273 282 290 292 294 
297 303-306 308 311 317-318 322-323 
325-326 334-335 340 342 348 354 358 
373 377 381-382 387 394 398 401-402 
405-406 409-412 416 422 425-427 429- 
431 440 449-453 462 466-468 474 478- 
479 481-484 490 492-496 500-501 505- 
506 517-518 522-525 532 537 540-541 
545 551 558 560 563-564 580 583 587- 
589 593 597 599 606-607 610 617-621 
625-628 633 635 641-643 658-659 664- 
669 674 682 686 688-691 696 699 715 
724 730 740 742-743 747 750 752 759 
761 765-766 768-769 779 789 796 802- 
803 813 818-819 822 831 837 843 845 

RAQ RfJj BAA 8£R RAQ 871 R1A 87/^ 
0^0-0^7 oOZ Ovrt 0O0-007 O/lO / H 0 /o- 

877 887 893-894 896-897 907-909 9P 
919-921 923 925 928 936 940-942 944 
946-947 950 953 955 958-959 962-963 
967 969 973 981 

trachea 

Clontech 

TRC001 

33-34 55-56 69 74 163 172 190 209 212 
07H 007 IfK 1\A 1^7 All A9£ A77
ZO/ Z/U ZV / jUj JH jJZ H J j ^tZO-^f Z / 

466-467 500 502 504 580 586 610 613 
633 642 688 691 711 724 738-739 774
782 816 820 839 848 862 868-869 914- 
915 928 968 

uterus 

Clontech 

UTR001 

4 9 18 37 63-64 74 108 114-115 130-131 
160 166 179 184 190 209 233 249 269 
285 301 314 327 337 348 384 394 399- 
400 403 406 411 425 431 434 437 440
462 474 485 490 508-509 526 532 579 
617-619 636 642-643 672 761 769 793 
837 849 864 887 903 906 928 934 947 
967 


TABLE 2 


SEQ 
ID 
NO: 

ACCESSION 
NUMBER 

SPECIES 

DESCRIPTION 

SMITH- 
WATERMAN 
SCORE 

% 

IDENTITY 

1

L06175

Homo sapiens 

occurs in MHC class I region; ORF 

308 

98 

2 

Y70775 

Homo sapiens 

Follistatin-related protein zfsta. 

3094 

98 

3 

X15187 

Homo sapiens 

precursor polypeptide (AA -21 to 
782) 

4112 

100 

4 

AF110640

Homo sapiens 

orphan seven-transmembrane 
receptor 

344 

100 

5 

G03798 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 7879. 

158 

72 

6 

W85607 

Homo sapiens 

Secreted protein clone da228_6. 

1477 

100 

7 

Y30162 

Homo sapiens 

Human dorsal root receptor 4 
hDRR4. 

884 

88 

8 

Y15227

Homo sapiens 

Leul 

391 

100 

9 

Y28817 

Homo sapiens 

pt326_4 secreted protein. 

3338

100 

10 

X92106 

Homo sapiens 

bleomycin hydrolase 

2445 

100 

11 

Y15228 

Homo sapiens 

Leu2 

445 

100 

12 

U27838 

Mus musculus 

glycosyl-phosphatidyl-inositol- 
anchored protein homolog 

432 

34 

13 

U27838 

Mus musculus 

glycosyl-phosphatidyl-inositol- 
anchored protein homolog 

320 

27 

14 

Y71062 

Homo sapiens 

Human membrane transport protein, 
MTRF-7. 

2323 

99 

15 

U96781 

Homo sapiens 

Ca2+ ATPase of fast-twitch skeletal 
muscle sarcoplasmic reticulum, adult
isoform 

5145 

100 

16 

M16653 

Homo sapiens 

pancreatic elastase IIB zymogen

1435 

99 

17 

Y13398 

Homo sapiens 

Amino acid sequence of protein 
PRO346.

1749 

99 

18 

Y02283 

Homo sapiens 

Secreted protein clone br342_11
polypeptide sequence. 

1399 

99 

19 

Y53030 

Homo sapiens 

Human secreted protein clone d24_1
protein sequence SEQ ID NO:66. 

1371 

100 

20 

AL031320 

Homo sapiens 

dJ20N2.5 (novel protein similar to 
fucosidase, alpha-L-1, tissue (EC
3.2.1.51, alpha-L-fucosidase
fucohydrolase)) 

2597 

99 

21 

B01384 

Homo sapiens 

Neuron-associated protein. 

1876 

100 

22 

Y68778 

Homo sapiens 

Amino acid sequence of a human 
phosphorylation effector PHSP-10. 

2470 

100 

23 

Y55935 

Homo sapiens 

Human KHS2 protein. 

4781 

99 

24 

Y55935 

Homo sapiens 

Human KHS2 protein. 

2807 

100 

25 

AC024792 

Caenorhabditis 
elegans 

contains similarity to TR:O95029 

463 

31 

26 

Y07972 

787 

Human secreted protein fragment 

1540 

100 

27 

X97630 

Homo sapiens 

serine/threonine protein kinase 

3781 

98 

28 

AF150755 

Mus musculus 

microtubule-actin crosslinking factor 

3514 

68 

29 

AF150755

Mus musculus 

microtubule-actin crosslinking factor 

3725 

70 

30 

Z38011 

Mus musculus 

DMR-N9 

2988 

86 

31

AJ000522 

Homo sapiens 

axonemal dynein heavy chain 

6058 

99 

32 

AF037256 

Mus musculus 

ES2 protein 

2260 

91 

33 

S62140 

Homo sapiens 

TLS=nuclear RNA-binding protein 

2917 

100 

34 

S62140 

Homo sapiens 

TLS=nuclear RNA-binding protein 

2890 

98 

36 

AB038237 

Homo sapiens 

G protein-coupled receptor C5L2 

1767 

100 

37 

D79994 

Homo sapiens 

similar to ankyrin of Chromatium 
vinosum. 

6089 

99 

38 

X63380 

Homo sapiens 

serum response factor-related protein 

1966 

99 

39 

AL022072 

Schizosacchar 
omyces pombe 

lipoic acid synthetase 

1067 

61 

40 

J03930 

Homo sapiens 

alkaline phosphatase 

2751 

100 

41 

AF132968 

Homo sapiens 

CG1-34 protein 

1088 

98 

42 

AL117637

Homo sapiens 

hypothetical protein 

2208 

100 

43 

AL021393 

Homo sapiens 

bK747E2.1 (novel protein)

1526 


44 

X68011 

Homo sapiens

ZNF81 

1886 


45 

AC002464 

Homo sapiens

organic cation transporter; SO%
similarity to JC4884 (PID:g2143892)

2423 

100

46 

W78245 

Homo sapiens 

Fragment of human secreted protein 
encoded by gene 19.

1949 

100 

47 

Y41765 

Homo sapiens 

Human PRO1083 protein sequence.

3604 

100 

48 

AF097330 

Homo sapiens 

H1 chloride channel; p64H1; CLIC4

1305 

99 

50 

U09413 

Homo sapiens 

zinc finger protein ZNF135

1361 

57 

51 

AF061812 

Homo sapiens 

keratin 16 

2374 

100 

52 

W63681 

Homo sapiens 

Human secreted protein L 

1326 

99 

53 

AB035303 

Homo sapiens 

cadherin-10 

4094 

100 

54 

A12022

synthetic 
construct 

MRP-8 

485 

100 

55 

AL121897 

Homo sapiens 

bA392M18.3 (KIAA0180) 

1867 

100 

56 

Y73330 

Homo sapiens 

HTRM clone 397663 protein 
sequence. 

818 

96 

57 

AF151018 

Homo sapiens 

HSPC184 

955 

100 

58 

AF125042 

Homo sapiens 

bisphosphate 3'-nucleotidase

1586 

100 

59 

AF118670

Homo sapiens 

orphan G protein-coupled receptor 

1971 

100 

60 

X04494 

Homo sapiens 

precursor polypeptide 

1903 

100 

61 

AT20SS65 

Homo sapiens 

EDRF 

528 

100 

62 

D15057 

Homo sapiens 

DAD-1 

567 

100 

63 

AF260665 

Homo sapiens 

histone acetyltransferase 

1510 

100 

64 

AF260665 

Homo sapiens 

histone acetyltransferase 

1429 

96 

65 

AJ277145 

Homo sapiens

ras-related small GTPase RAB18

1071 


66 

Y94950 

Homo sapiens

Human secreted protein clone
dh1073_12 protein sequence SEQ ID
NO: 106.


100

67 

Y82744 

Homo sapiens 

DNA replication and repair 
associated protein (DRASP). 

1028 

100 

68 

Y44486 

Homo sapiens 

Human GPRW receptor polypeptide. 

1721 

100 

69 

AL031228 

Homo sapiens 

dJ1033B10.2 (WD40 protein BING4 
(similar to S. cerevisiae YER082C, 
M. sexta MNG10 and C. elegans 
F28D1.1) 

3196 

100 

70 

AJz /o_> lo 

Homo sapiens 

zinc finger protein

1751 

52 

71 

V | 0-7 1 A 

Y 1 5 J 14 

Homo sapiens 

paraplegin-like protein

4146 

99 

72 

AF157028

Homo sapiens 

protein phosphatase methylesterase-1 

2017 

100 

74 

Y71U82 

Homo sapiens 

Human B-aggressive lymphoma
(BAL) protein. 

1765

99 

75 

AF225420

Homo sapiens 

\T\r\K 
AUUZj 

734 

100 

76 

"VA CT) C 

X95235 

Homo sapiens 

transcription iacior /\rz 

217 

100 

77 

AF108420

Takifugu 
rubripes 

1-aminocyclopropane-carboxylate
synthase 

733 

56 

78 

G01349

Homo sapiens 

Human secreted protein, SEQ ID 

650 

99 

79 

AL117635

Homo sapiens 

hypothetical protein 

922 

99 

81 

Z85986 

Homo sapiens 

dJ108K11.3 (similar to yeast
suppressor protein SRP40)


77 

82 

AF183414

Homo sapiens 

hemin-sensitive initiation factor 2a 
kinase

3231 

99 

83 

/^•A 11/11 

Homo sapiens 

Human secreted protein, SEQ ID

INVJ. JZZ^. 

495 

98 

84 


___ . 

Homo sapiens 

N-ethylmaleimide-sensitive factor

3744 

99 

85

Y 1 / fy 1 

Homo sapiens 

v pro ic hi 

1496 

100 

87 

Arzoijis 

Homo sapiens 

growth differentiation factor 3

1944 

99 

88

Y ly/j / 

Homo sapiens 

in MO 47S frnm W09999943 

1361 

100 

89 

AF161493 

Homo sapiens 

HSPC144 

1185 

100 

90 

AF161493 

Homo sapiens 


856 

100 

91 

B25780 

787 

Human secreted protein SEQ ID 

647 

41 

92 

U57344 

Mus musculus 

Meis3 

1007

07 

93 

AF172854

Homo sapiens 

cardiotrophin-like cytokine CLC

1 1 07 

117/ 

98 

94 

AL390114 

Leishmania 
major 

extremely cysteine/valine rich
protein 

223 

29 

95 

AB016886

Arabidopsis 
thaliana 

contains similarity to adenylate 
kinase-gene id:MCA23.18 

?X7 

38 

96 

AC005525 

Homo sapiens 

F22162_1

lOJJ 

96 

97 

B20997 

Homo sapiens 

Human nucleic acid-binding protein, 
NuABP-1. 

J OJU 

99 

98 

AJ006692 

Homo sapiens 

ultra high sulfer keratin 

507 

70 

99 

AF172264

Homo sapiens 

Traf2 and NCK interacting kinase, 
splice variant 1 

6Q49 

07*t-6 

99 

100 

L11239

Homo sapiens 

homeobox protein 

717 

100 

101 

AC004890 

Homo sapiens 

similar to zinc finger proteins; 
similar to AAC01956

(r JJJ.gZOHj III) 

91 S4 

98 

102 

AC003682 

Homo sapiens 

R28830_2

1287 

48 

103 

AF201839 

Rattus 
norvegicus 

dynamin IIIbb isoform

4970 

95 

104 

Y79510 

Homo sapiens 

Human carbohydrate-associated

pi ULClil \sS\jjf\r-\J. 

1394 

100 

105 

Y79510 

Homo sapiens 

Human carbohydrate-associated 
proiein ^ivd/vt -o. 

1209 

90 

106 

AL096748 

Homo sapiens 

hypothetical protein 

1216 

100 

108

X97260 

Homo sapiens 

Metallothionein 2 

181 

100 

109 

AL034422 

Homo sapiens 

dJ1141E15.2 (novel protein)

433 

100 

110 

AF191338 

Homo sapiens 

anaphase-promoting complex subunit 
4 



111

AT 09171? 

S\JU\JA 1 / li. 

Arabidopsis

thaliana 

putative protein

185 

26 

112 

AF250138 

Homo sapiens 

small stress protein-like protein 
HSP22 

1063 

100 

113 

AL109976 

Homo sapiens 

dJ794I6.Ll (novel protein) 

4176 

99 

114 

Y36151 

787 

Human secreted protein 

668 

100 

115 

AF110399

Homo sapiens 

elongation factor Ts 

1666 

100

116 

AF210317 

Homo sapiens 

facilitative glucose transporter family 
member GLUT9 

2052 

99 

117 

Y73328 

Homo sapiens 

HTRM clone 082843 protein 
sequence. 

931 

100 

118

X04085 

Homo sapiens 

catalase 

2846 

100

119 

AF147717 

Homo sapiens 

ubiquitin C-terminal hydrolase 
UCH37 

1695 

100 

120 

X73882 

Homo sapiens 

microtubule associated protein 

3801 

77 

121 

AC004882 

Homo sapiens 

similar to CAA16821
(PID:g3255952) 

3223 

100 

122 

M93311 

Homo sapiens 

metallothionein-III

421 

100 

123 

G03827 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 7908. 

557 

94 

124 

G03827 

Homo sapiens 

Human secreted protein, SEQ ID
NO: 7908. 

999 

jj 

125 

AF232009 

Homo sapiens 

peroxisomal trans 2-enoyl CoA
reductase 


99

126 

AB004906 

Ipomoea 
purpurea 

transposase 

146 

20 

127 

M60165 

Homo sapiens 

guanine nucleotide-binding 
regulatory protein 2 

1832

99 

128 

Y10319

Homo sapiens 

carnitine carrier 

1592 

100 

129 

U75467 

Drosophila 
melanogaster 

Atu 

937 


130 

Z21507 

Homo sapiens 

human elongation factor-1-delta

404 

£7 

131 

Z21507 

Homo sapiens 

human elongation factor-1-delta

yjo 

! io,n 

132 

Y58633 

Homo sapiens 

Protein regulating gene expression 
PRGE-26 

6745 

100 

133 

Y58633 

Homo sapiens 

Protein regulating gene expression 
PRGE-26. 

4818 

95 

134 

M13692 

Homo sapiens 

alpha-1 acid glycoprotein precursor

1064 

99 

135 

U72970 

Sus scrofa 

calcium/calmodulin-dependent
protein kinase II isoform gamma-B 

979^ 

99

136 

G03213 

Homo sapiens

Human secreted protein, SEQ ID 
NO: 7294. 

450 

100 

137 

AC005102 

Homo sapiens 

small inducible cytokine subfamily A 
member 24 

627 

99 

138 

AF155648 

Homo sapiens 

putative zinc finger protein 

5855 

92 

139 

AF144638

Homo sapiens 

sphingosine-1-phosphate lyase

2977 

100 

140 

AF152318 

Homo sapiens 

protocadherin gamma A1

4778 

100 

141 

B08517 

Homo sapiens 

Amino acid sequence of a beta- 
tubulin antigen. 

5841 

100 

142 

X56667 

Homo sapiens 

calretinin 

1410 

99 

143 

X92763 

Homo sapiens 

tafazzins 

1605 

100 

144 

Y95293 

Homo sapiens 

Human GEF containing NEK-like 
kinase substrate sGNK. 

4092 

99 

145 

AF226046 

Homo sapiens 

GK003 

1198 

100 

146 

M22877 

Homo sapiens 

cytochrome c 

554 

98 

147 

AJ272212 

Homo sapiens 

protein serine kinase 

2196 

100 

148 

AB026491 

Homo sapiens 

PICK1 

2114 

98 

149 

AB018580 

Homo sapiens 

hluPGFS 

1699 

100 

150 

X91868 

Homo sapiens 

six1

1509

100 

151 

AF266505 

Mus musculus

pseudouridine synthase 3 

2135 

84 

152 

U29170 

Drosophila 
melanogaster 

ANON-23D 

883 

43 

153 

G04075 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 8156. 

567 

99

154 

AY009128 

Homo sapiens

ISCU2

138 

100 

155

AF141315

Homo sapiens 

alpha-1,4-N-
acetylglucosaminyltransferase

1842 

100 

156 

AF110645

Homo sapiens 

candidate tumor suppressor p33 
ING1 homolog

1294 

99 

157 

AF159297

Zea mays 

extensin-like protein 

238 

25 

158 

AL133325 

Homo sapiens 

dJ984P4.3 (Homeobox protein 
NKX2B) 

1437 

100 

159 

AF073298 

Homo sapiens 

small EDRK-rich factor 2 

294

100 

160 

AC004858 

Homo sapiens 

U1 small ribonucleoprotein 1SNRP
homolog; match to PID:g4050087 

4032 

100

161 

AB012109 

Homo sapiens 

APC10 

990 

100 

162 

AL162751

Arabidopsis 
thaliana 

putative protein 

194 

32 

163 

AJ005698 

Homo sapiens 

poly(A)-specific ribonuclease 

3351 

100 

164 

AF117646

Homo sapiens 

long CBL-3 protein 

2547 

99 

165 

AC004002 

Homo sapiens 

similar to ciliary dynein beta heavy
chain; 78% Similarity to P23098
(PID:g118965)

5065 

100 

166 

M10942

Homo sapiens 

human metallothionein-Ie 

381 

100 

167 

AF126484 

Homo sapiens 

CARD4 

4961 

100 

168 

AF161518 

Homo sapiens 

HSPC169 

1604 

100 

169 

M64983 

Homo sapiens 

fibrinogen beta chain 

2482 

100 

170 

M64983 

Homo sapiens 

fibrinogen beta chain 

2679 

100 

171 

M58514 

Gallus gallus 

fibrinogen beta chain 

1059 

78 

172 

AF078845 

Homo sapiens 

16.7Kd protein 

786 

100 

173 

AC004774 

Homo sapiens 

Dlx-6 

923 

100 

174 

Z98974 

Schizosacchar 
omyces pombe 

putative vacuolar protein sorting- 
associated protein 

185 

31 

175 

X56203 

Plasmodium 
falciparum 

liver stage antigen 

283 

23 

176 

W74726 

Homo sapiens 

Human secreted protein fg949_3. 

1879 

100 

177 

AJ222967 

Homo sapiens 

cystinosin

1920 

100 

178 

AC024796 

Caenorhabditis 
elegans 

contains similarity to TR:O76167

221 

27 

179 

Y66632 

Homo sapiens 

Membrane-bound protein PRO276.

1370 

100 

180 

AF151803 

Homo sapiens 

CGI-45 protein 

215 

28 

181 

G02694 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 6775. 

283 

100 

182 

Y17292 

Homo sapiens 

Human cell death preventing kinase 
(DPK-1) protein sequence. 

2676 

100 

183 

AF234765 

Rattus 
norvegicus 

serine-arginine-rich splicing 
regulatory protein SRRP86 

148 

27 

184 

AF151855 

Homo sapiens 

CGI-97 protein 

1214 

96 

185 

AF289664 

Mus musculus 

CYLN2 

4673 

90 

186 

AL022238 

Homo sapiens 

dJ1042K10.2 (supported by 
GENSCAN, FGENES and 
GENEWISE)

4059 

100 

187 

AL022238 

Homo sapiens 

dJ1042K10.2 (supported by 
GENSCAN, FGENES and 
GENEWISE)

2332 

100 

188 

X83543 

Homo sapiens 

APXL 

8513 

99 

189 

AF059569 

Homo sapiens 

actin binding protein MAYVEN 

3106 

99 

190 

M18135 

Rattus 
norvegicus 

smooth-muscle alpha tropomyosin 

1306 

95 

191 

AF242194 

Drosophila 
melanogaster 

brakeless-B 

147 

52 

192 

D30689 

Bacillus 
subtilis 

subunit of nitrite reductase 

113 

29 

193 

Y44984 

Homo sapiens 

Human epidermal protein-1. 

538 

97 



WO 01/57190 


PCT/US01/04098 



194 

B25679 

Homo sapiens 

Human secreted protein sequence 
encoded by gene 15 SEQ ID NO:68.

760 

100 

195 

AB020315 

787 

homologue of mouse dkk-1 gene:Acc 

1466 

100 

196 

U35730 

Mus musculus 

jerky 

2021 

75 

197 

AL136450 

Homo sapiens 

dJ510O21.1 (novel protein)

632 

100

198 

X56203 

Plasmodium 
falciparum 

liver stage antigen 

512 

24 

199 

Y70775 

Homo sapiens 

Follistatin-related protein zfsta.

2027 

63 

200 

X87237 

Homo sapiens 

a-glucosidase I 

4447 

99 

201 

AF101078 

Caenorhabditis 
elegans 

CLU-1 

1393 

46 

202 

X04571 

Homo sapiens 

precursor polypeptide (AA -22 to 
1185) 

6611 

100 

203 

X00474 

Homo sapiens 

pS2 precursor 

466 

100 

204 

AB029333 

Halocynthia 
roretzi 

HrPET-1 

974 

54 

205 

AF146019 

Homo sapiens 

hepatocellular carcinoma antigen 
gene 520 

998 

100 

206 

AF071002 

Homo sapiens 

minK-related peptide 1; MiRP1

632 

100 

207 

AB038162 

Homo sapiens 

trefoil factor 2 

744 

100 

208 

U30521 

Homo sapiens 

P311 HUM 

363 

100 

209 

AB000911 

Sus scrofa 

ribosomal protein 

782 

100 

210 

AB021227 

Homo sapiens 

membrane-type-5 matrix
metalloproteinase 

3545 

100 

211 

AF180920

Homo sapiens 

cyclin L ania-6a

2722 

100 

212 

AF105365

Homo sapiens 

K-Cl cotransporter KCC4 

5624 

100 

213 

U29244 

Caenorhabditis 
elegans 

similar to human (TRE) transforming 
protein (PIR:S22157)

602 

32 

214 

AL033538 

Homo sapiens 

dJ477H23.1 (novel protein) 

3195 

100 

215 

X52011 

Homo sapiens 

muscle determination factor 

1262 

100 

216 

AF083248 

Homo sapiens 

ribosomal protein L26 homolog 

739 

100 

217 

AF006751 

Homo sapiens 

ES/130 

4793 

99 

218 

AB007859 

Homo sapiens 

KIAA0399 protein 

3559 

99 

219 

AK026291 

Homo sapiens 

unnamed protein product 

826 

100 

221 

Y84045 

Homo sapiens 

Splice variant of cancer associated 
polypeptide CHI -9a 11-2. 

5851 

97 

222 

Z67996 

Homo sapiens 

tenascin-R (restrictin) 

7186 

100 

223 

AF134802 

Homo sapiens 

cofilin isoform 1 

846 

100 

224 

Y17711 

Homo sapiens 

atopy related autoantigen CALC 

1611 

99 

225 

AF190051

Gallus gallus 

hepatocyte nuclear factor 1a
dimerization cofactor isoform 

443 

81 

226 

AK026256 

Homo sapiens 

unnamed protein product 

866 

98 

227 

Z69368 

Schizosaccharomyces
pombe

nuf2-like coiled-coil protein 

230 

25 

228 

AF275948 

Homo sapiens 

ABCA1 

11763 

99 

229 

AF161384 

Homo sapiens 

HSPC266 

2006 

98 

230 

Y16270

Homo sapiens 

paralemin 

1951 

100 

231 

AJ245599 

Homo sapiens 

putative secreted ligand 

2379 

99 

232 

W88499 

Homo sapiens 

Human stomach carcinoma clone 
HP10412-encoded protein. 

1545 

99 

233 

AF096286 

Mus musculus 

pecanex 1 

3623 

93 

234 

V64619_cd1

Homo sapiens 

30-NOV-1990 Human HE1 cDNA. 

796 

100 

235 

V64619_cd1

Homo sapiens 

30-NOV-1990 Human HE1 cDNA.

470 

98 

236 

AF227258 

Bos taurus 

RPGR-interacting protein-1

1262 

38 

237 

AJ132445 

Homo sapiens 

claudin-14 

1181 

100 

238 

AL034562 

Homo sapiens 

dJ684024.2 (prodynorphin (Beta-
Neoendorphin-Dynorphin precursor,
Proenkephalin B precursor))

1330

100



239 

AF262027 

Homo sapiens 

eIF-5A2 

808 

100 

240 

AL079344 

Arabidopsis 
thaliana 

putative protein 

194 

JJ 

241 

AC002394 

Homo sapiens 

Gene product with similarity to 
dynein beta subunit 

1 j4Z 

CI 

J 1 

242 

AJ271361 

Takifugu
rubripes 

FRANK2 protein 


1(\ 

J\J 

243 

AL021918 

Homo sapiens 

b34I8.1 (Kruppel related Zinc Finger
protein 184) 

1476 

48 

244 

AF190167 

Homo sapiens 

membrane associated protein SLP-2 

1736 

99 

245 

Y10601

Homo sapiens 

ankyrin-like protein

5877 

1 (\f\ 

246 

AL121771 

Homo sapiens 

dJ548G19.1.1 (novel protein
(ortholog of mouse zinc finger 
protein ZFP64) (translation of cDNA 
NT2RP3001398 (Em:AK001596)) 
(isoform 1)) 

3628 

100 

247 

L25314 

Drosophila 
melanogaster 

actin-related protein 

984 

47 

248 

X63745 

Homo sapiens 

KDEL receptor 

i pick 

i fin 

249 

AF112208

Homo sapiens 

13kDa differentiation-associated 
protein 

510 

1 On 

250 

AP001707

Homo sapiens 

human gene for claudin-8, Accession 
No. AJ250711 

1172

1 AH 

251 

AL136125 

Homo sapiens 

dJ304B14.1 (novel protein) 

/ lo 

IvA/ 

252 

AL031186 

Homo sapiens 

bK984G1.1 (supported by FGENES)

<11 

jdZ 

inn 

253 

Y17531 

Homo sapiens 

Human secreted protein clone BL205 
14 protein. 

ojy 

inn 

254 

AL049843 

Homo sapiens 

dJ392M17.3 (KIAA0349 protein) 

6741 

99 

255 

AJ242972 

Homo sapiens 

TOLLIP protein

1 AO A 

OQ 

yy 

256 

Y94873 

Homo sapiens 

Human protein clone HP02632.

lo/O 

L\J\J 

257 

AF279865 

Homo sapiens 

kinesin-like protein GAKIN

OQCi'X 
ZyKJJ 

100 

258 

AL024498 

Homo sapiens 

dJ417M14.1 (novel protein) 

^QQ 
Joy 

mo 

259 

R66278 

Homo sapiens 

Therapeutic polypeptide from 
glioblastoma cell line. 

830 

100 

260 

AF101784 

Homo sapiens 

b-TRCP variant E3RS-IkappaB 

3226 

99 

261 

AF101784 

Homo sapiens 

b-TRCP variant E3RS-IkappaB 


100 

262 

AF101784 

Homo sapiens 

b-TRCP variant E3RS-IkappaB 

"2 1 AQ 

00 

yy 

263 

AF197060 

Homo sapiens 

src homology 3 domain-containing 
protein HIP-55 

LJLjI 

100 

264 

Y86262 

Homo sapiens 

Human secreted protein HAQAK23,
SEQ ID NO: 177. 

/Ou 

100 

265 

Y56966 

Homo sapiens 

Human SBPSAPL polypeptide.

9770 
£. l ty 

100 

266 

Y56966 

Homo sapiens 

Human SBPSAPL polypeptide. 

1018

yy 

267 

AJ300465 

Homo sapiens 

putative white family ATP-binding 
cassette transporter 

1 ^S7 
ID J f 

95 

y-j 

268 

AC004030 

Homo sapiens 

F21856 2 

^70 

ty 

99 

269 

X55954 

Homo sapiens 

HL23 ribosomal protein 

714 

100 

270 

AB033921 

Mus musculus 

Ndr1 related protein Ndr2

i 

94 

! y^ 

271 

AF081886 

Homo sapiens 

ERO1-like protein

1 QO<I 

00 

yy 

272 

AF166492 

Homo sapiens 

small GTPase RAB6B 

1060 

100 

273 

AL022238 

Homo sapiens 

dJ1042K10.4 (novel protein) 

2201 

i on 

274 

W88667 

Homo sapiens 

Secreted protein encoded by gene 
134 clone HAIBP89. 

i <no 

99 

275 

X00129 

Homo sapiens 

precursor RBP 

1044 

97 

276 

Z47500_cd1

Homo sapiens 

11-MAY-1998 Human RHOH gene
sequence. 

1161 

100 

277 

AB049188 

Equus caballus 

ubiquitin C-terminal hydrolase 

1118 

96 


278

AF270647 

Homo sapiens 

GTT1 

1564 

100 

279 

AF143956 

Mus musculus 

coronin-2 

2414 

94 

280 

R85151 

Homo sapiens 

Endothelial cell polypeptide. 

911 

92 

281 

R85151 

Homo sapiens 

Endothelial cell polypeptide. 

1031 

100 

282 

D83948 

Rattus
norvegicus

SI-1 protein

3975

90




283 

Y14768 

Homo sapiens 

I Kappa B-like protein 

2037 

100 

286 

AL031316 

Homo sapiens 

dJ28O10.3 (HSD11B1
(hydroxysteroid (11-beta)
dehydrogenase 1)

294

100



287

D64109 

Homo sapiens 

tob family 

1773 

99 

288 

AB026043 

Homo sapiens 

MS4A7 

1230 

100 

289 

M61866 

Homo sapiens 

Krueppel-related DNA-binding
protein

209

90



290 

AJ001810 

Homo sapiens 

mRNA cleavage factor I 25 kDa
subunit

1217

100



291 

Y99454 

Homo sapiens 

Human PRO1605 (UNQ786) amino
acid sequence SEQ ID NO:395.

694

100



292 

Y44824 

Homo sapiens 

Human molecule associated with cell
proliferation, MACP-4.

2370

100



293 

AJ276101 

Homo sapiens 

GPRC5B protein 

2099 

100 

294 

AF161406 

Homo sapiens 

HSPC288 

719 


295 

Y58628 

Homo sapiens

Protein regulating gene expression

1276 

1 \J\J 1 




PRGE-21. 



296 

U91561 

Rattus
norvegicus

pyridoxine 5'-phosphate oxidase

1239

87




297 

L02956 

Xenopus
laevis

ribonucleoprotein

1624

83




298 

AF226730 

Homo sapiens 

Cyt19

1729 

99 

299 

AF226730 

Homo sapiens 

Cyt19

906 

98 

300 

Y54324 

Homo sapiens 

Amino acid sequence of a human
gastric cancer antigen protein.

718

89



301 

AF125533 

Homo sapiens 

NADH-cytochrome b5 reductase
isoform

1606

100



302 

Y32206 

Homo sapiens 

Human receptor molecule (REC)
encoded by Incyte clone 2825826.

1676

98



303 

AF247565 

Homo sapiens 

hepatocellular carcinoma associated
ring finger protein

525

100



304 

AF208844 

Homo sapiens 

BM-002 

428 

100 

305 

AC004983 

Homo sapiens 

similar to PID:g3877944

1988 

100 

306 

AL132978 

Arabidopsis
thaliana

putative protein

210

25




307 

Y10530 

Homo sapiens 

olfactory receptor 

1645 

100 

308 

AF180681 

Homo sapiens 

guanine nucleotide exchange factor 

3597 

100 

309 

AF111856 

Homo sapiens 

sodium dependent phosphate
transporter isoform NaPi-3b

3591

99



310 

Y13583 

Homo sapiens 

G-protein coupled receptor 

2171 

100 

311 

Z73420 

Homo sapiens 

cE146D10.2 (mercaptopyruvate
sulfurtransferase (EC 2.8.1.2))

1598

100



312 

X79535 

Homo sapiens 

beta tubulin 

2348 

100 

313 

AF070658 

Homo sapiens 

HSPC002 

861 

100 

314 

AF078866 

Homo sapiens 

SURF-4 

1395 

100 

317 

Z37986 

Homo sapiens 

phenylalkylamine binding protein 

1258 

100 

320 

AB047892 

Macaca
fascicularis

hypothetical protein

258

82




321 

Y25755 

Homo sapiens 

Human secreted protein encoded
from gene 45.

1440

100



322 

AB016531

Homo sapiens 

PEX16

1741 

100 

323 

AL391141 

Arabidopsis
thaliana

putative protein

274

49




325 

AF140501

Homo sapiens 

DNA polymerase iota 

3691 

99 

326 

X96698 

Homo sapiens 

D1075-like 

1450 

96 

327 

AF152325

Homo sapiens 

protocadherin gamma A5 

4769 

100 

328 

AF151803 

Homo sapiens 

CGI-45 protein 

1970 

100 

329 

X74070 

Homo sapiens 

transcription factor BTF3 

639 

81 

330 

AF171102 

Homo sapiens 

retinal degeneration B beta 

1302 

95 

331 

W54040 

Homo sapiens

Human interferon-inducible protein
HIFI. 

484 

98 

332 

AF024617 

Homo sapiens

transcription-associated zinc ribbon
protein 

691 

100 

333 

U19181 

Rattus 
norvegicus 

Rabin 3

2129 

90 

334 

G03877 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 7958. 

621 

100 

335 

AL008582 

Homo sapiens 

bK223H9.2 (ortholog of A. thaliana 
F23F1.8) 

626 

100 

336 

AF110774

Homo sapiens 

adrenal gland protein AD-001

647 

100 

337 

AB011414

Homo sapiens 

Kruppel-type zinc finger protein

1674 

58 

338 

AF207600 

Homo sapiens 

ethanolamine kinase 

129 

100 

340 

AC020579 

Arabidopsis
thaliana

putative
phosphoribosylformylglycinamidine
synthase* ^ 5^09-29950



341 

Y28576 

Homo sapiens 

Secreted peptide clone pe503 1 . 

944 

100 

342 

U32274 

Saccharomyces
cerevisiae

Ydr386wp; CAI: 0.12 

191 

37 

343

A01771 

AUl III. 

by i linen L. 
construct 

VabCUIdT allllCOd.gUl<ilLng piOlCUl 

AGO 1 

yy 

344 


Homo sapiens

uncharacterized hematopoietic
stem/progenitor cells protein
MDS032


ion 

345 

Y70400 

Homo sapiens

Human cell-signalling protein-2

754 

100 

346 

Y50926 

Homo sapiens 

Human fetal brain cDNA clone 
vc 16 1 derived protein

962 

100 

347 

AF183428

Homo sapiens

28.4 kDa protein

1329 

100 

348 

AC006069 

Arabidopsis 
thaliana 

putative cleavage and
polyadenylation specificity factor

1383 

55 

349 

AL032631 

Caenorhabditis 
elegans

Y106G6H.8 

194 

39 

350 

U70669 

Homo sapiens

Fas-ligand associated factor 3

167 

23 

351 

Y93468 

Homo sapiens 

Amino acid sequence of a potassium
channel interactor protein. 

1182 

92 

352 

AF005856 

Drosophila 
yakuba 

anon2A5 

111 

45 

353 

AJ271684 

Homo sapiens 

myeloid DAP12-associating lectin

1013 

100 

354 

AF099100 

Homo sapiens 

WD-repeat protein 6 

2882 

99 

355 

U51730 

Murine
leukemia virus

reverse transcriptase

316 

42 

356 

D50617 

Saccharomyces
cerevisiae

YFL042C 

279 

27 

357 

D50617 

Saccharomyces
cerevisiae

YFL042C 

279 

27 

358 

AF161432 

Homo sapiens 

HSPC314 

1059 

93 

359 

AB029488 

Homo sapiens 

CllorGl 

758 

99 

360 

AJ251024 

Homo sapiens 

putative odorant binding protein ag 

1239 

100 

361 

U43281 

Saccharomyces
cerevisiae

Lpg22p 

2074 

74 

362 

U43281 

Saccharomyces
cerevisiae

Lpg22p 

2153 

74 
363

AC007153 

Arabidopsis
thaliana

100632

156

24




364 

AF197927

Homo sapiens 

AF5q31 protein

3992 

99 

365 

D28500 

Homo sapiens 

mitochondrial isoleucine tRNA
synthetase

4286

98



366 

X97868 

Homo sapiens 

arylsulphatase 

3141 

98 

367 

AL162048

Homo sapiens 

hypothetical protein 

1532 

100 

368 

L36062 

Mus musculus 

steroidogenic acute regulatory
protein

189

25



369 

AF113249

Homo sapiens

multiple domain putative nuclear
protein

1022

59



370 

M15888 

Bos taurus 

endozepine-related protein precursor 

2425 

84 

371 

X66363 

Homo sapiens 

serine/threonine protein kinase 

2562 

100 

372 

W74802 

Homo sapiens 

Human secreted protein encoded by
gene 73 clone HSQEL25.

1532

89



373 

AF100772

Homo sapiens

tenascin-M1

11535

99

374 

AF090934

Homo sapiens 

PRO0518 

382 

100 

375 

AB021643 

Homo sapiens 

gonadotropin inducible transcription
repressor-3

2761

99



376 

AB049758 

Homo sapiens 

MAWD binding protein

1331 

100 

377 

AF070666 

Homo sapiens 

Kruppel-associated box protein 

466 

97 

378 

S59342 

Mus sp. 

nuclear pore complex glycoprotein
p62

464

60



379 

AF149205 

Mus musculus 

Su(var)3-9 homolog Suv39h2 

1690 

88 

380 

AF227906 

Homo sapiens 

UDP-glucose:glycoprotein
glucosyltransferase 2 precursor

7851

99



381 

AF118566

Mus musculus 

hematopoietic zinc finger protein 

1769 

92 

382 

AK000619 

Homo sapiens

unnamed protein product

RIO 

OIU 

ion 

1 V/Vi 

383 

AF227906 

Homo sapiens 

UDP-glucose:glycoprotein
glucosyltransferase 2 precursor

7851

99



384 

AF117946

Homo sapiens

Link guanine nucleotide exchange
factor II

2363

100



385 

AF125390 

Drosophila
melanogaster

L82G

139

41




386 

Y94907 

Homo sapiens 

Human secreted protein clone
ca 1 06 1 9x protein sequence SEQ ID
NO:20\

1092

50



387 

U18795

Saccharomyces
cerevisiae

Yel064cp

206

28




388 

AF177388

Homo sapiens

cancer-amplified transcriptional
coactivator ASC-2

10748

99



389 

AJ002744 

Homo sapiens 

UDP-GalNAc:polypeptide N-
acetylgalactosaminyltransferase 7

3469

96



390 

AF097366 

Homo sapiens 

cone sodium-calcium potassium
exchanger

3166

100



391 

AF217525

Homo sapiens

Down syndrome cell adhesion
molecule

5337

60



392 

U81035 

Rattus
norvegicus

ankyrin binding cell adhesion
molecule neurofascin

3967

91



393 

X65224 

Gallus gallus 

neurofascin 

4097 

78 

394 

X13916 

Homo sapiens 

LDL-receptor related precursor (AA
-19 to 4525)

4292

99



395 

AF151083 

Homo sapiens 

HSPC249 

444 

98 

396 

AB017026 

Mus musculus 

oxysterol-binding protein 

2173 

98 

397 

AL035587 

Homo sapiens 

dJ475N16.4 (KIAA0240) 

2393 

100 

398 

W74813 

Homo sapiens 

Human secreted protein encoded by
gene 85 clone HSDFV29.

722

92



399 

Y71110

Homo sapiens 

Human Hydrolase protein-8
(HYDRL-8).

1637

99

400 

AF039718 

Caenorhabditis
elegans

contains similarity to lupus LA
protein homologs 

325 

43 


401

AE000877

Methanothermobacter
thermoautotrophicus

conserved protein

231 

36 

402 

Y27795 

Homo sapiens 

Human secreted protein encoded by 
gene No. 79. 

1539 

99 

403 

Z50853 

Homo sapiens 

CLPP 

615 

100 

405 

X03475 

Rattus 
norvegicus 

ribosomal protein L35a (aa 1-110)

576 

99 

406 

AF144237

Homo sapiens 

LOMP protein 

252 

44 

407 

U20239 

Mus musculus 

fibrosin 

288 

76 

409 

AL033378 

Homo sapiens 

dJ323M4.1 (KIAA0790 protein)

6026 

99 

410 

X54326 

Homo sapiens 

glutaminyl-tRNA synthetase 

7577 

99 

411 

X61585 

Bos taurus 

polynucleotide adenylyltransferase 

3715 

97 

412 

AF217190 

Homo sapiens 

MLEL1 protein 

5271 

99 

414 

G02815 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 6896. 

314 

95 

415 

AJ245922 

Homo sapiens 

alpha-tubulin 8 

2370 

100 

416 

AF203032 

Homo sapiens 

neurofilament protein 

220 

21 

417 

Z97653 

Homo sapiens 

c380A1.2.1 (novel protein (isoform
1)) 

1567 

100 

418 

AJ404326 

Homo sapiens 

SR+89 

1871 

99 

419 

AJ404326 

Homo sapiens 

SR+89 

902 

64 

420 

AF134726

Homo sapiens 

G9A 

5334 

99 

421 

L28125 

Podospora 
anserina 

beta transducin-like protein 

288 

39 

422 

W21733 

Homo sapiens 

NIP-1 encoded by clone 59. 

110 

72 

423 

S67970 

Homo sapiens

ZNF75=KRAB zinc finger 

951 

76 

424 

L28035 

Mus musculus

protein kinase C gamma 

3768 

98 

426 

Y73373 

Homo sapiens

HTRM clone 921803 protein 
sequence. 

555 

56 

427 

Y73373 

Homo sapiens 

HTRM clone 921803 protein 
sequence. 

266 

49 

428 

X61118 

Homo sapiens 

TTG-2a/RBTN-2a 

876 

100 

429 

Z96932 

Homo sapiens 

nuclear autoantigen of 14 kDa

496 

83 

430 

AJ277291 

Homo sapiens 

HELG protein 

678 

72 

431 

X82157 

Homo sapiens 

hevin 

3525 

99 

432 

AC007192 

Homo sapiens 

P85B HUMAN; PTDINS-3- 
KINASE P85-BETA 

3825 

99 

433 

AL021918 

Homo sapiens 

b34I8.1 (Kruppel related Zinc Finger 
protein 184) 

1713 

50 

434 

AF084464 

Rattus 
norvegicus 

GTP-binding protein REM2 

141 

29 

435 

AL049795 

Homo sapiens 

dJ622L5.2 (novel protein) 

1756 

98 

436 

M14513 

Rattus 
norvegicus 

(Na+ and K+) ATPase, alpha(III) 
catalytic subunit 

4269 

99 

437 

U33460 

Homo sapiens 

DNA-directed RNA polymerase I, 
largest subunit 

8777 

98 

438 

D87076 

Homo sapiens 

similar to human bromodomain 
protein BR140(JC2069) 

3067 

100 

439 

L43912 

Macaca 
mulatta 

mannose-binding protein A 

589 

93 

440 

D31763 

Homo sapiens 

ha0946 protein is Kruppel-related. 

927 

49 

441 

U70976 

Homo sapiens 

arrestin 

2068 

99 

442 

B08069 

Homo sapiens 

A human beta-alanine-pyruvate 
aminotransferase (HAPA). 

2343 

99 

443 

AF100662

Caenorhabditis
elegans

contains similarity to ubiquitin
carboxyl-terminal hydrolase (Pfam:
UCH-1.hmm, score: 28.46) (Pfam:
UCH-2.hmm, score: 47.53)

166

24



444 

D78017 

Rattus
norvegicus

NFI-A1

2667

70




445 

AL049569 

Homo sapiens 

dJ37C10.3 (novel ATPase) 

2418 

100 

448 

AJ242540 

Volvox carteri
f. nagariensis

hydroxyproline-rich glycoprotein
DZ-HRGP

165

34



449 

AJ133352 

Homo sapiens 

ZNF237 protein 

2006 

100

450 

AJ133352 

Homo sapiens 

ZNF237 protein 

1025 

96 

451 

AF170708

Homo sapiens 

T-box protein TBX3 

3700 

99 

452 

AK002080 

Homo sapiens 

unnamed protein product 

1546 

99 

453 

L32977 

Homo sapiens 

Rieske Fe-S protein 

1239 

93 

454 

X51760 

Homo sapiens 

zinc finger protein (583 AA) 

1533 

57 

455 

Y01141

Homo sapiens

Secreted protein encoded by gene 7


QQ 




clone HTLFA90



456 

AB006631

Homo sapiens

The human hnmnlno c%f mnncp f^n y-9 


inn 

457 

AF067165 

Homo sapiens

tsliiKs llllgCJ plULCLLl J 

077 

64 

458 

AF01R16Q 

Homo sapiens

unknown

I D4 

i<> 

459

W /3/J4 

Homo sapiens 

Human secreted protein encoded by 

1180

95 




gene iy clone nKoMLoy. 



460

I IQ7OO0 
Uy /UK) A 

— — . — 

Caenorhabditis 

similar to acyl-CoA dehydrogenases 

CO 

583 

37 



elegans 

and epoxide hydrolases; Pfam 






aomain Jrruu^Hi {/vcyi-coA on), 






ocore — d i .h, tr- value— i . /e- 10, in— z, 






PAntciiTtP cimimnH/ trt Prim ^hmiin 

CUIllainS SmUIaTliy lO i iaill aojjiain 






PFn070? n-TvHrnlacp^ ^rnrf»=57 4 






L» ValUv IV i«7> in » 



461 

AK023114

Homo sapiens

unnamed protein product

1041


462 

M93134

Friend murine
leukemia virus

pol protein


44




463 

AF055473 

Homo sapiens


91? 

47 

466 

Y51415 

Homo sapiens

T-Tnman wilrl tvnp nl<f pRl nrnfpin 
■i luuiau VY 11 Li ijfL/C |yl\.s<Oj LrllHVsLll. 

969 S 

inn 

467 

Y51417 

787 
/ o / 

ri ui i rail pivco j spuLc vdriaiiL proicm 


i fin 

1UU 

468 

Y57936 

Homo sapiens

Human transmembrane protein

1 690 

06 




n i ivj_r JN"OV. 



469 

D38552 

Homo sapiens

TTip fin 1 rMTtfp in ic Tptat/^H 
llic llal piUlClll lo 1 ClalCU 

900S 

Z!77 J 

i nn 




cyclophilin.



470 

Y700 1 3 

1 / \J\J X J 


nuiildu riUlcaoC al 1U abSOCiaicu 


i no 

1UU 




r»rntpin-7 ^PPR^t-7^ 
jji uiciii- / jrrvvj - / 



471 

AJ924747 

Homo sapiens

r^^tprmin^l variant r*FhTNI A riT 

7060 

inn 

1UU 




inpliiHitiO' 7 m inn apirJ **vr'han<yf»c 
UlwUUJllg Z. aJlililu oV/lVi CAClJcUl^tyb 






anH ori inQerrir\ti n"F7R nminn ariHc in 
(Uiu ail iMowiui/ii VI £tO aiiiiiiLi aLiUj ill 






frame. 



472 

W99665 

Homo sapiens

Human secreted protein clone

1 S46 

ion 




dul57 12 protein.



473 

W99665 

Homo sapiens

Human secreted protein clone

998 

770 

98 




dul57 12 protein. 



474 

X63526 

Homo sapiens 

homologue to elongation factor 1- 

2273 

99 




gamma from A.salina 



475 

X15940

Homo sapiens 

ribosomal protein L31 (AA 1-125) 

644 

100 

476 

M60832 

Homo sapiens

alpha-2 type VIII collagen

J JO 1 

QQ 

477 

AF039697 

Homo sapiens

antigen NY-CO-1 1 

191^ 

Q7 

478 

AF156929

Sus scrofa 

inflammatory response protein 6 

1588 

83 

479 

AF264717 

Homo sapiens 

FYVE domain-containing dual
specificity protein phosphatase
FYVE-DSP2

5610

99



480 

AF044578 

Homo sapiens 

putative DNA polymerase; POMP 

2478 

94 

481 

X89750 

Homo sapiens 

TGIF protein 

1413 

100 


My j l u / 

Homo sapiens 

— — — 

(R)-3-hydroxybutyrate
dehydrogenase


yo 

AStl 
HOJ 


Homo sapiens

Rhn/11RP7 
D0p/3-)JjrZ 

1556 

4 1 

*T 1 

4 OH 

Ar 131 3J6 

Homo sapiens 

fleoxycyiiayi iransierose, rvcvip 

4281 

yy 

485 

Z98884 

Homo sapiens 

dJ467L1.1 (KIAA0833)

699 

73 


AJZ4jo /4 

Homo sapiens 

oligophrenin-4


mo 

1 Uu 

487 

Z11737 

Homo sapiens 

flavin-containing monooxygenase 4 

2969 

100 

A OO 

488 

AJOlZi 

Mus musculus 

talin 


77 

/i on 

AJZ /ol lZ 

Homo sapiens 

putative cell cycle control protein 

JJJ 

91 
Zj 

/I OA 

W /4o4j 

Homo sapiens 

Human secreted protein encoded by 
gene 1 i j cione huvdaUj. 

IV/ 1 J 

OR 

491 

Y41337 

Homo sapiens 

Human secreted protein encoded by 
gene 3U clone rLKi^jJV4 / . 

509 

36 

492 

X90530 

Homo sapiens 

ragB 

1926 

99 

493 

X90530

Homo sapiens 

ragB 

izin^ 

00 

yy 

AC\ A 

494 

X90530 

Homo sapiens 

ragB 


yO 

495 

AL022394 

Homo sapiens 

dJ511B24.3 (KIAA0395 (probable
homeobox protein)) 

4990 

99 

496

Y11395

Homo sapiens 

lanthionine synthetase C-like protein 
1 

7 1 £R 
/loo 

inn 

1UU 

497 

AJ010119

Homo sapiens

Ribosomal protein kinase B (RSK-B)


i no 

498 

G01563

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 5644. 

330 

100 

499 

X54131 

Homo sapiens 

protein-tyrosine phosphatase 

10465 

99 

500 

G01082

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 5163. 

549 

100 

501 

AC004142 

Homo sapiens 

similar to murine leucine-rich repeat 
protein; possible role in neural 
development by protein-protein 
interactions; 93% similarity to 
D49802 (PID:g1369906)

3676 

100 

502 

AL117544

Homo sapiens 

hypothetical protein 

1226 

100 

503 

AF203032 

Homo sapiens 

neurofilament protein 

5115 

99 

504 

AL034417 

Homo sapiens 

bK215D11.2 (similar to rat gene 33)

2476 

100 

505 

X69090 

Homo sapiens 

190kD protein 

7546 

99 

506 

U58755 

Caenorhabditis 
elegans 

coded for by C. elegans cDNA
yk34b1.5; coded for by C. elegans
cDNA yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded for
by C. elegans cDNA yk46d5.5;
coded for by C. elegans cDNA
yk43c2.5; coded for by C. elegans
cDNA yk46e8.3; coded for by C.
elegans cDNA yk43c2.3; coded for
by C. elegans cDNA yk46d5.3;
coded for by C. elegans cDNA
yk13f10.3; coded for by C. elegans
cDNA yk34b1.3

782 

55 

CAT 

507 

Ajzyiiuy 

Homo sapiens 

NhLrz protein 

ani 

5U1 

inn 

508 

T T1 AA/I C 

Rattus 
norvegicus 

cytoplasmic dynein intermediate 
chain zr> 


07 

y 1 

509 

AF063231 

Mus musculus 

cytoplasmic dynein intermediate 
chain 2

3159 

97 

<t A 



JWlJL. 1 u 

4336 

95 

511 

Y13115 

Homo sapiens 

serine/threonine protein kinase 

5071 

99 

512 

AB030207 

Homo sapiens 

G gamma subunit 

364 

100 

513 

AF039571 

Homo sapiens 

peripheral benzodiazepine receptor 
interacting protein; PBR-IP/PRAX1 

495 

33 

514 

AB037883 

Homo sapiens 

Gb3/CD77 synthase 

1916 

99 

515 

D90868 

Escherichia 
coli 

similar to 

1489

100 

516 

X98834 

Homo sapiens 

zinc finger protein Hsal2 

5290 

100 

517 

AF055668 

Mus musculus 

apoptosis-linked gene 4, deltaC form 

2904 

78 

518 

AF019926

Mus musculus 

protein kinase 

1694 

90 

519 

M34513 

Homo sapiens 

omega protein 

317 

91 

520 

Y08612 

Homo sapiens 

88kDa nuclear pore complex protein 

2313 

99 

521 

Y08612 

Homo sapiens 

88kDa nuclear pore complex protein 

1561 

99 

522 

AL096766 

Homo sapiens 

dA59H18.1 (KIAA0767 protein)

2497 

100 

523 

AF186249

Homo sapiens 

six transmembrane epithelial antigen 
of prostate 

1790 

100 

524 

AB029012

Homo sapiens 

KIAA1089 protein 

4933 

100 

525 

AB026893 

Homo sapiens 

vascular cadherin-2 

5962 

100 

526 

X74331 

Homo sapiens 

DNA primase (p58 subunit) 

1720 

100 

528 

AC007228 

Homo sapiens 

R31665 2 

1488 

47 

529 

X14830

Homo sapiens 

acetylcholine receptor beta-subunit 
preprotein 

2639 

100 

530 

U80446 

Caenorhabditis 
elegans 

coded for by C. elegans cDNA
yk172e6.3; coded for by C. elegans
cDNA yk158f7.3; coded for by C.
elegans cDNA yk158f7.5; coded for
by C. elegans cDNA yk172e6.5

420 

39 

531 

S76838 

Mus sp. 

Dbs 

4821 

88 

532 

Z82215 

Homo sapiens 

dJ6802.2 (myosin, heavy 
polypeptide 9, non-muscle) 

9828 

100 

533 

AF245505 

Homo sapiens 

adlican 

277 

31 

534 

AF300612 

Homo sapiens 

N-acetylgalactosamine-4-O- 
sulfotransferase 

993 

59 

535 

AL121928 

Homo sapiens 

bA18I14.3 (pleckstrin and Sec7
domain protein) 

3333 

99 

536 

AJ271055 

Mus musculus 

iroquois homeobox protein 6 

1724 

76 

537 

AF180473

Homo sapiens 

Not2p 

2267 

100 

538 

AF071059 

Mus musculus 

zinc finger RNA binding protein 

1089 

51

539 

AF023453 

Homo sapiens 

actin-related protein 3-beta 

2219 

100 

540 

AC003030 

Homo sapiens 

R29828 1 

1401 

70 

541 

AC003030 

Homo sapiens 

R29828 1 

2294 

100 

542 

AL121889 

Homo sapiens 

dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 

2152 

100 

543 

AB006135

Rattus 
norvegicus 

db83 

1238 

98 

544 

G02650 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 6731. 

644 

97 

545 

Y07595 

Homo sapiens 

transcription factor TFIIH 

2373 

100 

546 

AL133545

Homo sapiens 

bA386N14.1 (novel protein similar 
to a dual specificity phosphatase) 

964 

99 

547 

X83618 

Homo sapiens 

hydroxymethylglutaryl-CoA 
synthase 

2647 

100 

548 

AF134726 

Homo sapiens 

NG37 

4359 

99 

549 

AB035356 

Homo sapiens 

neurexin I-alpha protein 

6948 

99 

551 

AB037901 

Homo sapiens 

gene amplified in squamous cell 
carcinoma-1

5215 

99 

552 

AB043634 

Homo sapiens 

PAR-6A 

885 

100 

553 

AP000693 

Homo sapiens 

partial CDS 

4875 

99 

554 

AF002223 

Homo sapiens 

myotubularin related 1 

3490 

100 

555 

AC004893 

Homo sapiens 

similar to NEDD-4 (KIA0093); 
similar to P46934 (PID:g1171682)

1611 

100 

556 

AJ404468 

Homo sapiens 

axonemal dynein heavy chain 

8328 

100 

557 

AJ404468 

Homo sapiens 

axonemal dynein heavy chain 

11137 

100 

558 

X65873 

Homo sapiens 

kinesin heavy chain 

4860 

100 

559 

AJ277365 

Homo sapiens 

polyglutamine-containing protein

592 

36 

560 

AF205600 

Homo sapiens 

transposase-like protein 

407 

27 

561 

X71125

Homo sapiens 

glutaminyl-peptide cyclotransferase 

1914 

100 

562 

X71125 

Homo sapiens 

glutaminyl-peptide cyclotransferase

1456 

97 

563 

X54304 

Homo sapiens 

myosin regulatory light chain 

897 

100 

564 

AF250842 

Drosophila 
melanogaster

multiple asters 

130 

23 

565 

Y58608 

Homo sapiens 

Protein regulating gene expression 

l lv VJ L-i 1. 

1619 

99 

566 

AL121893

Homo sapiens

hA 1 8QTC91 5 (novel protein similar
to retinoblastoma binding protein
(RBBP9))

1012 

100 

567 

AL117352

Homo sapiens 

dJ876B10.2 (novel protein (ortholog 
of rat EX084))

3713 

99 

568 

AF228603 

Homo sapiens 

pleckstrin 2 

1841 

100 

569 

AF239243 

Homo sapiens 

histone deacetylase 7 

3244 

86 

570 

AF087695 

Mus musculus 

veli 3 

989 

100 

571 

AB046381 

Homo sapiens 

testis-abundant finger protein 

1346 

99 

572 

AC005551

Homo sapiens

R26529 2 partial CDS

1020 

100 

573 

Y90290 

Homo sapiens 

Human peptidase, HPEP-7 protein
sequence.

274 

52 

574 

W76734 

Homo sapiens

Human mDia Rho targeting protein.

712 

32 

575 

AT 191915 


hAS17H? ^ ff-romnlpx 10 fa murine 

UrtJI 1 W£,.J \\. L»\JllljL>lv>A 1 \J \CL 111 111 Ulv 

ten hnmolo?^ 

853 

78 

576 

Y86217 

Homo sapiens

Human secreted protein HWHGU54
SEQ ID NO: 132. 

2123 

99 

577 

AL121716 

Homo sapiens 

dJ202D23.2 (novel protein) 

6329 

99 

578 

AL121716 

Homo sapiens

dJ202D23.2 (novel protein) 

6329 

99 

579 

X92715 

Homo sapiens

KRAB/C2H2 zinc finger protein

3102 

97 

580 

X54637 

Homo sapiens

protein tyrosine kinase 

5564 

98 

581 

X78817 

Homo sapiens 

p115

1148 

44 

582 

AJ251245

Rattus
norvegicus

hinHina nrotein 7 

3086 

71 

583 

AF113125

Homo sapiens

E-1 enzyme

581 

100 

584 

M19529

Sus scrofa

follistatin A

1906 

98 

585 

AF169677 

Homo sapiens 

leucine-rich repeat transmembrane 
protein FLRT3

3403 

100 

586 

D87685 

Homo sapiens 

similar to human transcription factor
TFIIS (S34159)

8083 

99 

587 

Y00876 

Homo sapiens 

Human LAPH-1 protein sequence. 

2110 

100 

588 

Y99674 

Homo sapiens

Human GTPase associated protein-25.

2111 

99 

589 

D86973 

Homo sapiens

similar to Yeast translation activator
GCN1 (P1:A48126)

12033 

99 

590 

AL034452 

Homo sapiens 

dJ682J15.1 (novel Collagen triple 
helix repeat containing protein) 

1979 

100 

591 

Y57396 

Homo sapiens 

Human lysoenzyme LYC4 
polypeptide. 

814 

100 

592 

AJ297743 

Mus musculus 

torsinB protein 

1448 

85 

593 

AF164796

Homo sapiens

NADH-ubiquinone oxidoreductase
MLRQ subunit homolog

469 

100 

594 

Y41312

Homo sapiens

Human secreted protein encoded by
gene 5 clone HLDRM43.

749 

94 

595 

Y41312 

Homo sapiens 

Human secreted protein encoded by 
gene 5 clone HLDRM43. 

824 

100 

596 

Y77123 

Homo sapiens 

Human neurotransmission-associated 
protein (NTAP) 998868. 

2102 

98 

597 

AF215703

Drosophila
melanogaster

KISMET-L long isoform

1880

65




598 

AF070447 

Homo sapiens 

barrier-to-autointegration factor 

290 

90 

599 

X56203 

Plasmodium 
falciparum 

liver stage antigen 

372 

22 

600 

X79828 

Mus musculus 

NK10 

202 

53 

601 

AB004109 

Cricetulus 
griseus 

phosphatidylserine synthase II 

2262 

92 

602 

U94988 

Mus musculus 

Nulpl 

2912 

89 

603 

U94988

Mus musculus 

Nulpl 

2800 

86 

604 

AF006264 

Homo sapiens 

recombination and sister chromatid 
cohesion protein homolog 

2850 

100 

605 

AF006264 

Homo sapiens 

recombination and sister chromatid 
cohesion protein homolog 

2530 

100 

606 

X82260 

Homo sapiens 

RanGAP1

2929 

100 

607 

X82260 

Homo sapiens 

RanGAP1

1843 

97 

608 

AF160909

Drosophila 
melanogaster 

BcDNA.LD03471 

943 

58 

610 

X74801 

Homo sapiens 

gamma subunit of CCT chaperonin 

2745 

99 

611 

AL031427

Homo sapiens 

dJ167A19.1 (novel protein)

1608 

100 

612 

Y71072 

Homo sapiens 

Human membrane transport protein, 
MTRP-17. 

445 

100 

613 

X16396

Homo sapiens 

precursor polypeptide (AA -29 to
315)

1749 

100 

614 

AK000281 

Homo sapiens 

unnamed protein product 

1814 

99 

615 

AB011128

Homo sapiens 

KIAA0556 protein 

5761 

99 

616 

U19361 

Petromyzon 
marinus 

NF-180 

205 

21 

617 

AF045555 

Homo sapiens 

wbscr1

1208 

100 

618 

AF045555 

Homo sapiens 

wbscr1 alternative spliced product

1318 

100 

619 

U22229 

Felis catus 

ribosomal protein L41

128 

100 

620 

Y17169 

Homo sapiens 

A6 related protein 

1819 

100 

621 

Y12065

Homo sapiens 

hNop56 

2956 

99 

622 

AF177758 

Homo sapiens 

ubiquitin specific protease 16 

2998 

100 

623 

AF317425

Homo sapiens 

GAC-1 

3866 

100 

624 

AL050297 

Homo sapiens 

hypothetical protein 

1227 

99 

625 

AC007204 

Homo sapiens 

BC273239 1 

3398 

99 

626 

Z68747 

Homo sapiens 

imogen 38 

2024 

99 

627 

Z68747 

Homo sapiens 

imogen 38 

1958 

97 

628 

Y70229 

Homo sapiens 

Human RNA-associated protein-10
(RNAAP-10).

3424 

99 

629 

AF191492 

Homo sapiens 

nasopharyngeal carcinoma associated 
gene protein-8 

613 

100 

630 

AF119664

Homo sapiens 

transcriptional regulator protein 
HCNGP 

1574 

100 

631 

AF119664

Homo sapiens 

transcriptional regulator protein 
HCNGP 

1150 

89 

632 

Y17849 

Homo sapiens 

ganglioside-induced differentiation 
associated protein 1 

1839 

98 

633 

X55740 

Homo sapiens 

5-nucleotidase 

3012 

100 

634 

AF039688 

Homo sapiens 

antigen NY-CO-3 

931

100 

635 

AF119662

Homo sapiens 

E46 protein 

2424 

100 

636 

AB007836 

Homo sapiens 

Hic-5 

2544 

100 

637 

AF077818 

Mus musculus 

syntrophin-associated serine- 
threonine protein kinase 

2027 

44 

638 

AL035455 

Homo sapiens 

dJ1018E9.1 (VAMP (vesicle-
associated membrane protein)-
associated protein B and C)

150 

26 

639 

AF078844 

Homo sapiens 

hqp0376 protein 

416 

81

640

U28377

Escherichia
coli

ORF f239 - was ORF f191 and
ORF f194 before splice

1198

100

641 

AK024442 

Homo sapiens 

FLJ00032 protein

1677 

56 

642 

U58682 

Homo sapiens

ribosomal protein S28

340

100

643 

X57432 

Rattus rattus

ribosomal protein S2

1520 


644 

AB002348 

Homo sapiens

KIAA0350 protein

5186 


646 

Y96202 


1 1/ - o i~k o k-< l/'inQCP l Tic 1c 1 rtin/iinn 
LlvappaO tvillaaC ^liVXvy UUIUUlg, 

1 1 78 

70 




nmtein Y9HS6 



647 

AB029482 

Mus musculus

JNK-binding protein JNKBP1

4609 

SI 
o i 

648 

AB009053 

Arabidopsis

contains similarity to isoamyl

407

44



thaliana

acetate-hydrolyzing






esterase~pene id'MOR? 95 



650 

AC002550 

Homo sapiens

Unknown gene product

858 

99 

651 

U26592 

Homo sapiens

diabetes mellitus type I autoantigen

253 

66 

652 

X60155 

Homo sapiens

zinc finger 41

4349 

100 

653 

X53330 

Platynereis

H4 protein (AA 1 - 103)

523

100



dumerilii




654 

AC003682 

Homo sapiens

R9794S 9 

IVi. / 74J Z. 

Z.JJO 

100 

655 

X80473 

Mus musculus

rah 10 

I dU 1 7 

S06 

S6 

6S6 

109640 

xxaLLllo 

uiiAJiowii pruicuj 

901 
ZU 1 

OS 

7J 



norvegicus




657 

AC006014 

Homo sapiens

Similar to RFP transforming protein;

11^1 

00 

77 




similar to P14373 (PID:g132517)



658 

X92972 

Homo sapiens

protein phosphatase 6

1666 

100 

659 

L35269 

Homo sapiens

zinc finger protein

2803 

09 

660 

Arnni6X9 

rxUIIlU bdpiCIlS 

r 1 OJ4 / 1 

J 1 o't 

Q6 

UU I 

X7Q904 

Homo sapiens

didAin** i 

41 QS 

00 

77 


Y 17690 

Homo sapiens

iNinzj proiem 


00 

77 

661 

AR01 S61 7 

Homo sapiens 


1 <ioi 

CO 
oil 

004 

ZoOZoT 

Homo sapiens 

interferon regulatory factor 3 

23 J 1 

i nn 
1U0 

66 <\ 
ODD 


Pyrococcus 

t a r'rrivi /^i t tt* a tuiamc 
JLAC 1 UY LvjJL.U 1 A I HHJJNb 


TlO 



aoyssi 

t v a QP /nr 1 A A 1 K\ 






lv^pxHYT flT VOYAI A < sF^ 
1VJLE* 1 n X LUL I \JJ\u\L,J\dEi ) 












(G\ YOXAT ASF Ti 



666 

Z70200 

Homo sapiens

U5 snRNP-specific 200kD protein

OO 1 7 

99 

77 

667 

Z70200

Homo sapiens

U5 snRNP-specific 200kD protein

8SKQ 

OJ07 

07 

7 / 

668 

AF1 S14S0 

Manduca sexta

juvenile hormone esterase binding

99S 

19 




protein



669 

AF227198 

Homo sapiens

CrkRS 

7231 

99 

77 

670 


Homo sapiens

oJVi i proiem 

AA 1 

5/ 

671 

761 SRQ rdl 

W/MTirt ccjnipno 
nUUlU DapiCIli 

1 7_ AT TO_1 00R FlM A pnr»nr1ina n 


100 




Kiitmnn (~\C~*—1 nrnrpin 
llUillaii uv a piuiciii. 



67? 

AT1 19709 


/v i rd-dssociaieu laLior 


R8 
Oo 

671 

AF9041 SO 

Homo sapiens

potassium large conductance

1450

100



calcium-activated channel beta



subunit



674 

G02061 

Homo sapiens

Human secreted protein, SEQ ID

J JO 

99 

77 




NO: 6149



67 S 

H0 1946 

Homo sapiens

Human secreted protein, SEQ ID

141 
i*t 1 

77 




WO- S397 



676 

AR016R10 

Homo sapiens

rnobl 

/ll Q 

49 

677 

DR6Q70 

Homo sapiens

oiTTTilof" +r\ mi/ncin noun/ r>riQin' 

similar to myosm ncavy cndirt- 

lOi 

9R 




Containing ATP/GTP-binding site






motif A(P-loop) 



678 

U83115 

Homo sapiens 

non-lens beta gamma-crystallin like 

8569 

99 




protein 



679 

AF203687

Homo sapiens 

prolactin regulatory element-binding 

2181 

100 




protein

680 

M27685 

Mus musculus 

ultra-high sulphur keratin 

650 

58 

681 

U04968 

Cricetulus 
griseus 

nucleotide excision repair protein 

3712 

97 

682 

AF119663

Homo sapiens 

G-protein gamma-12 subunit

356 

100 

683 

G03733 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 7814. 

342 

100 

684 

X67699 

Homo sapiens 

CDw52 antigen 

297 

100 

685 

AF022789 

Homo sapiens 

ubiquitin hydrolyzing enzyme I 

1892 

100 

686 

AJ001006 

Mus musculus 

EMeg32 protein 

938 

96 

687 

W03516 

Homo sapiens 

Prostaglandin DP receptor. 

1864 

100 

688 

AF019661 

Mus musculus 

zeta proteasome chain; PSMA5 

1214 

100 

689 

AF156557

Homo sapiens 

stomatin related protein 

2036 

100 

690 

G03960 

Homo sapiens 

Human secreted protein, SEQ ID 
NO: 8041. 

593 

100 

691 

AF161512 

Homo sapiens 

HSPC163 

738 

100 

692 

AL031115 

Homo sapiens 

ZXDA, ZXDB (zinc finger X-linked 
protein) 

4298 

100 

693 

L40410 

Homo sapiens 

thyroid receptor interactor 

806 

100

694 

AC004542 

Homo sapiens 

OXYSTEROL-BINDING
PROTEIN-like; similar to P22059
(PID:g129308)

2533 

99 

695 

AF169411

Rattus 
norvegicus 

PAPIN 

4144 

52 

696 

Y58168 

Homo sapiens 

Human hydrolase homologue HHH- 
4. 

2144 

100 

697 

AF271994 

Homo sapiens 

dopamine responsive protein DRG-1 

1613 

100 

698 

Y41741 

Homo sapiens 

Human PRO704 protein sequence


ion 

699 

AL133506 

Unknown 

/prediction=(method:""genscan"",
version:""1.0"", score:""109.13"");
/prediction=(method:

825 

48 

700 

Y96870 

Homo sapiens 

Human goose-type lysozyme
(GOLY).

1032

100

701 

AC003034 

Homo sapiens 

Gene with similarity to rat kidney-
specific (KS) gene

1190 

100 

702 

AC003034 

Homo sapiens 

Gene with similarity to rat kidney- 
specific (KS) gene 

937 

95 

703 

AJ242832 

Homo sapiens 

calpain 

3756 

100 

704 

S52624 

Homo sapiens 

unknown 

185 

100 

705

AF005081 

Homo sapiens 

skin-specific protein 

652 

100 

706 

Y16793 

Homo sapiens 

keratin, type 1 

2232 

100 

707 

Y44985 

Homo sapiens 

Human epidermal protein-2. 

455 

69 

708 

AF113220

Homo sapiens 

MSTP040 

686 

100 

709 

Y44985 

Homo sapiens 

Human epidermal protein-2. 

408 

65 

710 

Y16132

Homo sapiens 

CDT6 

1874 

100

711 

Y68775 

Homo sapiens 

Amino acid sequence of a human 
phosphorylation effector PHSP-7. 

2407 

100 

712 

X63422 

Homo sapiens 

H(+)-transporting ATP synthase 

209 

100 

713 

AF169968

Mus musculus 

DNA binding protein DESRT 

1467 

79 

714 

X52563 

Bos taurus 

permeability increasing protein

383 

29 

715 

AJ277739 

Homo sapiens 

RPB11b1alpha protein

480 

98 

716 

AL135791 

Homo sapiens 

bA162G10.3 (zinc finger protein)

401 

98 

717 

AF223466 

Homo sapiens 

HT015 protein

1311 

97 

719 

AF117383

Homo sapiens 

placental protein 13; PP13 

746 

100 

720 

Z98743 

Homo sapiens 

dJ181C9.2 (Rho GTPase activating
protein 8 (RhoGAP, p50RhoGAP)) 

324 

100 

721 

AL163815 

Arabidopsis 
thaliana 

putative protein 

653 

61 

722 

G01436

Homo sapiens 

Human secreted protein, SEQ ID
NO: 5517.

418

96

723 

AF282919 

Mus musculus 

Zfp228 

349 

49 

724 

AB023191 

Homo sapiens 

KIAA0974 protein 

2953 

100 

725 

AL031778 

Homo sapiens 

dJ34B21.1 (novel BZRP

920 

100




(benzodiazapine receptor (peripheral) 






(MBR, PBR, PBKS, IBP, 






Isoquinoline-binding protein)) LIKE 






protein) 



726 

AL021939 

Homo sapiens 

dJ352A20.2 (aldehyde 

1764 

100 




dehydrogenase family protein) 



727 

AF182426

Rattus 

arylacetamide deacetylase 

791 

42 



norvegicus 




728 

Y08565 

Homo sapiens 

UDP-GalNAc:polypeptide N- 

3331 

99 




acetylgalactosaminyltransferase 



729 

AF155135 

Homo sapiens 

novel retinal pigment epithelial cell 

1652 

99 




protein 



730 

AL078606 

Arabidopsis 

putative protein 

211 

55 



thaliana 




731 

Y73352 

Homo sapiens 

HTRM clone 1732368 protein 

1720 

100 




sequence. 



732 

AF178432 

Homo sapiens 

SH3 protein 

3302 

100 

733 

Y17832 

Human 

env protein 

223 




endogenous 






retrovirus K 




734 

Y28859 

Homo sapiens 

Human mesoderm induction early 

2067 

70 




response protein ER1. 



735 

U09355 

Oryctolagus 

protein phosphatase 2A1 B gamma

2352 

99 



cuniculus 

subunit 



736 

Y94922 

Homo sapiens 

Human secreted protein clone pv6 1 

724 

99 




protein sequence SEQ ID NO:50. 



737 

AB027003 

Mus musculus 

protein phosphatase 

378 

84 

738 

AF112200

Homo sapiens 

NADH-oxidoreductase B18 subunit 

739 

100 

739 

AF112200

Homo sapiens 

NADH-oxidoreductase B18 subunit 

613 

88 

740 

AF302154 

Homo sapiens 

SPG protein 

6556 

100 

741 

B25681 

Homo sapiens 

Human secreted protein sequence 

1410 

99 




encoded by gene 17 SEQ ID NO:70. 



742 

L27479 

Homo sapiens 

X123 

1237 

99 

743 

L27479 

Homo sapiens 

X123 

1206 

97 

744 

Y66745 

Homo sapiens 

Membrane-bound protein PRO1186.

588 

99 

745 

AJ001019 

Homo sapiens 

ring finger protein 

1292 

99 

746 

X68453 

Sus scrofa 

tubulin-tyrosine ligase 

1882 

94 

747 

Y57897 

Homo sapiens 

Human transmembrane protein 

1173 

100 




HTMPN-21. 



748 

AF151069 

Homo sapiens 

HSPC235 

1694 

96 

749 

AF182404 

Homo sapiens 

mitochondrial uncoupling protein 1 

1674 

100 

750 

AL121993 

Homo sapiens 

dJ776P7.1 (Novel protein)

2500 

99 

751 

AF149825 

Homo sapiens 

PACSIN3 

2253 

100 

752 

AL008635 

Homo sapiens 

dJ510H16.2 (high-mobility group 

3026 

99 




protein 2-like 1) 



753 

Y57914 

Homo sapiens 

Human transmembrane protein 

1124 

100 




HTMPN-38. 



754 

AF285109 

Homo sapiens 

septin 3 isoform B 

1766 

100 

755 

AF004161 

Oryctolagus 

peroxisomal Ca-dependent solute 

2371 

95 



cuniculus 

carrier 



756 

Z19585 

Homo sapiens 

thrombospondin-4 

4239 

100 

757 

AP001745 

Homo sapiens 

similar to zinc finger 5 protein 

1857 

100 

758 

AF190664 

Mus musculus 

LMBR2 

555 

72 

759 

AF090326 

Mus musculus 

AE-1 binding protein AEBP2 

1540 

97 

760 

AL096677 

Homo sapiens 

dJ322G13.3 (novel protein similar to
bovine and mouse beta-soluble NSF
attachment protein (SNAP-beta))

999

94

761 

AC003007 

Homo sapiens 

Unknown gene product (partial) 

649 

96 

762 

U66372 

Bos taurus 

ribosomal protein S29 

230 

7T 

764 

Y90899 

Homo sapiens 

D1-like dopamine receptor activity

1152

100



modifying protein SEQ ID NO: 1.



765 

U88169

Caenorhabditis 

similar to molybdopterin biosynthesis

1204 

65 



elegans 

MOEB proteins 



766 

AL118506

Homo sapiens 

dJ591C20.3.1 (novel DnaJ domain 

1091 

100 




protein, similar to mouse and bovine 






cysteine string protein) 



767 

AK024693 

Homo sapiens 

unnamed protein product 

3767 

100 

768 

Z11518

Homo sapiens 

histidyl-tRNA synthetase 

2582 

100 

769 

X13916 

Homo sapiens 

LDL-receptor related precursor (AA 

25529 

100 




-19 to 4525) 



770 

AC009360 

Arabidopsis 

Contains 3 PF|00400 WD40, G-beta

333 

13 
j j 



thaliana 

repeat domains. 



771 

AB037685 

Mus musculus 

LANP-like protein 

1246 

91 

772 

AL161578 

Arabidopsis 

putative protein 

335 

46 



thaliana 




773 

AL161578

Arabidopsis 

putative protein 

333 

47 



thaliana 




774 

AY008271 

Homo sapiens 

helicase SMARCAD1 

5264 

99 

775 

Y21591 

Homo sapiens 

Human secreted protein (clone 

1127

96 




CC332-33). 



776 

W88853 

Homo sapiens 

Polypeptide fragment encoded by

752 

i 




gene 89. 



777 

W88853 

Homo sapiens 

Polypeptide fragment encoded by 

752 

100 




gene 89. 



778 

W88853 

Homo sapiens 

Polypeptide fragment encoded by 

752 

100 




gene 89. 



779 

AF196481 

Homo sapiens 

RING finger protein; FXY2 

3644 

100 

780 

AL035427 

Homo sapiens 

dJ769N13.1 (KIAA0443 protein.) 

1609 

54 

781 

AB026187

Homo sapiens 

protocadherin-Xa 

5244 

100 

782 

B24458 

Homo sapiens 

Human secreted protein sequence 

1002 

100 




encoded by gene 22 SEQ ID NO: 83. 



783 

AB027289 

Homo sapiens 

cyclin-E binding protein 1 

5421 

100 

784 

G02916 

Homo sapiens 

Human secreted protein, SEQ ID

627 

100 




NO: 6997. 



785 

AJ245822 

Homo sapiens 

type I transmembrane receptor 

4560 

100 

786 

AJ245820 

Homo sapiens 

type I transmembrane receptor 

4624 

100 

787 

Z48042 

Homo sapiens 

GPI-anchored protein p137

3340 

99 

788 

AL031782 

Homo sapiens 

dJ708F5.1 (PUTATIVE novel 

2739 

100 




Collagen alpha 1 LIKE protein) 



789 

AJ131245 

Homo sapiens 

Sec24B protein 

6602 

100 

790 

AF107203

Homo sapiens 

ataxin 2-binding protein 

2008 

100 

791 

Y14690

Homo sapiens 

procollagen alpha 2(V) 

600 

34 

792 

AL031055 

Homo sapiens 

dJ28H20.2 (novel protein) 

1267 

100 

793 

Y36194 

787 

Human secreted protein 

2051 

99 

794 

AB028127 

Homo sapiens 

mannosyltransferase 

2138 

96 

795 

AC007228 

Homo sapiens 

R31665 2 

2738 

79 

796 

AL049482 

Arabidopsis 

putative protein 

436 

47 



thaliana 




797 

AC004528 

Homo sapiens 

R32184J 

891 

91 

798 

AB037830 

Homo sapiens 

KIAA1409 protein

7532 

100 

799 

X53793 

Homo sapiens 

5' half of the product is homologous

2232 

100 




to Bacillus subtilis SAICAR






synthetase, 3' half corresponds to the 






catalytic subunit of AIR carboxylase

800 

Y99350 

Homo sapiens 

Human PRO1375 (UNQ715) amino
dcia sequence otv^ ii>» invj.jj. 

1343 

100 

801 


Homo sapiens

juiiciupinuii typej 

199S 

A "7 
** / 

802 

AB029324 

Rattus
norvegicus

TIP120-family protein TIP120B

3916 

90 

803 

AB029324 

Rattus
norvegicus

TIP120-family protein TIP120B 

4961 

90 

804 

AF251040 

Homo sapiens 

putative nuclear protein 

2119 

100 

RAS 


Homo sapiens 

F-box and WD-repeats protein beta- 
TRCP2 isoform C 

z5/y 

100 


T IR7^fK 

UO /jUJ 

Rattus
norvegicus

transmembrane receptor UNC5H1

19^'7 
jZj i 

OA 

91) 

807 

AF1 1 RRRO 

rVr 1 1 0007 

Rattus
norvegicus 

b-tomosyn isoform 

31 JJ 

0*7 

fine 
ouo 

AF99£QQ'* 

Rattus

norvegicus 

selective LIM binding factor 

o/yj 

95 

RAO 

W lyy J y 

Homo sapiens 

Human Ksr-1 (kinase suppressor of
Ras).

39J9 

99 

Rl A 

/\JLfU J 1 / OZ 

— — : 

Homo sapiens

dJ708F5.1 (PUTATIVE novel
Collagen alpha 1 LIKE protein)


1 00 
IUO 

81 1 

Oil 

A P009 S49 

Homo sapiens

cimilar tr\ C* f*\p>nanc F1 1 A 1ft C* OftO/ 

similar 10 eiegans r i i/\iu.j, ou/o 


1 AA 
1UU 

812 

U83246 

Homo sapiens 

copine I 

606 

52 

O IJ 

AF949*;S9 

Gallus gallus

retinovin 


34 

R14 

O 1** 


Homo sapiens 

zinc finger protein 10 

lOJ 1 

y3 

815 

X52332 

Homo sapiens 

zinc finger protein 10 

2423 

99 

816 

Y09631 

Homo sapiens 

PIBF1 protein 

2935 

99 

51 / 

Yi 1 007 
X/ly97 

Rattus 
norvegicus 

myosin I 

3883 

98 

818 

AY004877 

Mus musculus 

cytoplasmic dynein heavy chain

11105 

98 

Q1Q 

oly 

VOTl 0£ 

1/ / lyo 

Homo sapiens 

Human cyclic nucleotide 
phosphodiester PDE8B(E) amino 
acid sequence. 

3790 

100 

R90 

OZU 

A "FAR 1 0/17 
/\r V/o 1 y** 1 

__- 

Mus musculus 

tektin 

1 134 

0 1 

6l 

R91 

i AT A3<.1A£ 
/\1AJ0 D IUO 

Homo sapiens 

ujyyod i.i (continues in 
Em:AL445192 as bA269H4.1) 

0*7 1 

1 AA 
1UU 

R99 

A FA997Q<\ 

Homo sapiens 

TGF beta receptor associated protein- 

i 
i 

3oj 

Z4 

823 

AF01S77A 

Mus musculus

radical fringe

1 /197 

R9 
oZ 

824 

TTR960S 

T-JfMTl/"» canipnc 

jtiajiiiij sapiens 

expresseu-A.t{Zoo i o proieui 

i ****** 

yy 

825 

X7737 1 

Mesocricetus
auratus

POT? 1 

O** 1 

7R 

826 

AB014S76 

XXUlllLf SdpiCIlo 

VTA A C\(\1f\ nrofpin 

90£ 

70 

iy 

827 

AT 049733 

Hftmn canipnc 
rxvjiiivi csapiciio 

HTR7SH^ 1 f*APK1 antitrpn^ 
ujo / jnj.i ^Arrvi aiiiigGiiy 

1 SR4 

79 

828 

AF222980 

Homo sapiens 

disrupted in Schizophrenia 1 protein 

4418 

100 

890 


Homo sapiens

sox-2 

1 £R^ 
10o3 

1 AA 


AF9QS771 

Homo sapiens

rai guanine nucieonue aissociauon 

SLIIIIUiaiUI 

** / 1 / 

OQ 

831 


l-fr»ni r\ conipnc 
•TAv/niV/ oapiciio 

VJv^JV lalllliy JVUldSC IVIllNrv.-^ 

OoOO 

inn 

IUU 

832 

T 0494R 

Saccharomyces
cerevisiae

mitochondrial transporter protein

^^R 
335 

33 

833 


Mus musculus

Fior» nrofpjn 

Fisn pruieui 

7A4 
/U** 

y** 

834 

Z34289 

Homo sapiens 

nucleolar phosphoprotein p130

3455 

99 

835 

U10991 

Homo sapiens


(rtJU 

OR 
yo 

836 

AF230877 

Homo sapiens 

MIP-T3 

2945 

99 

837 

X58288 

Homo sapiens 

protein-tyrosine phosphatase 

7734 

99 

838 

X56958 

Homo sapiens 

ankyrin (brank-2) 

9631 

100 

839 

AC024791 

Caenorhabditis 
elegans 

contains similarity to beta-lactamases 

370 

24

840 

D83197

Homo sapiens 

ankyrin repeat protein 

807 

QO 

841 

AF053711

Serinus
canaria

neurofilament medium subunit

too 


842 

AF283772 

Homo sapiens 

similar to Homo sapiens ribosomal 
protein L10 encoded by GenBank 
Accession Number L25899 

990

96 

843 

U76343 

Homo sapiens 

GABA transport protein 

2992 


844 

Y13645

Homo sapiens 

uroplakin II 

897 

100

845 

D21064 

Homo sapiens 

similar to rat general mitochondrial
matrix processing protease mRNA 
(RATMPP). 

2710 

00 

846 

AF192522 

Homo sapiens 

Niemann-Pick C3 protein; NPC3 

7047 

100 

847 

AF192522

Homo sapiens 

Niemann-Pick C3 protein; NPC3 

5472 

100 

848 

X60489 

Homo sapiens 

elongation factor-1-beta

1162 

100 

849 

AC007204 

Homo sapiens 

BC273239 1 

2277 

67

850 

AC003682 

Homo sapiens 

R28830 1 

2401 

100

851 

AL121583 

Homo sapiens 

bA358N2.1 (novel protein) 

353 

61 

852 

Z48475 

Homo sapiens 

glucokinase regulator

11SS 
j i jj 

00 

yy 

853 

Z83844 

Homo sapiens 

dJ37E16.2 (SH3-domain binding
protein 1)


OR 

854 

AF233323 

Homo sapiens 

Fas-associated phosphatase-1

390 

36 

855 

AF062741 

Rattus 
norvegicus 

pyruvate dehydrogenase phosphatase
isoenzyme 2

447 

CO 

5U 

856 

Y11411

Homo sapiens

pristanoyl-CoA oxidase

JJ7J 

yo 

857 

M97188 

Strongylocentrotus
purpuratus

tektin A1

900 

4£ 

858 

AB001105 

Homo sapiens 

hippocalcin-like protein 4


100 

859 

AF164791

Homo sapiens 

putative 38.3kDa protein 

1795 

100

860 

AF298117 

Homo sapiens 

homeobox protein OTDC9

1477 

7J 

861 

AF015264

Rattus 
norvegicus 

golgi peripheral membrane protein 
p65 

1820 

81 

862 

X16901 

Homo sapiens 

30kb subunit of RAFttO /74 

19R4 

100

863 

M12140 

Homo sapiens 

envelope protein 

202 

81 

864 

AF161459 

Homo sapiens

HSPC109 

Rl S 

OlJ 

QR 

yo 

865 

AL109983 

Homo sapiens

dJ718P11.1.1 (novel class II
aminotransferase similar to serine
palmitoyltransferase (isoform 1))

444 

100

866 

M77183 

Rattus 
norvegicus 

alpha-1-macroglobulin

227 

45 

867 

AF272663 

Homo sapiens 

gephyrin 

3785 

100 

868 

X75285 

Mus musculus 

fibulin-2 

3258 

87 

869 

X82494 

Homo sapiens 

fibulin-2 

3407 

99 

870 

AJ297743 

Mus musculus 

torsinB protein 

169 

43 

871 

AJ278313 

Homo sapiens 

phospholipase C-beta-1a

6258 

99 

872 

AF073344 

Homo sapiens 

ubiquitin-specific protease 3

256 


873 

Y91955 

Homo sapiens 

Human cytoskeleton associated 
protein 10 (CYSKP-10).

535 

100 

874 

AJ000414 

Homo sapiens 

Cdc42-interacting protein 4

1 H6 

1 1 JU 

JJ 

875 

AF265555 

Homo sapiens 

ubiquitin-conjugating BIR-domain 
enzyme APOLLON

627 

100 

876 

Y48586 

Homo sapiens 

Human breast tumour-associated
protein 47. 

/.jj / 

yo 

877 

AF182198 

Homo sapiens 

intersectin 2 long isoform 

8764 

99 

878 

L17308 

Gossypium 
hirsutum 

proline-rich cell wall protein 

192 

35 

879 

AF177169 

Homo sapiens 

tropomodulin 2 

1769 

100 

880 

W03627 

Homo sapiens 

Human follicle stimulating hormone 
GPR N-terminal sequence. 

210 

23 



881 

AL021068 

Homo sapiens 

dJ206D15.3 

2615 

99 

RR7 

AP00S4QR 

Homo sapiens

K J 1 OO J Z 

1 1 R 
Jlo 

oZ 

RR1 

AF16SS1 R 

Homo sapiens

ivi/vovjii lsoiorm 

1 R7 
1 oZ 


R R4 

D7 1911 

Homo sapiens

protein tyrosine phosphatase (PTP-
BAS, type 3)

T£R 
JOO 


RRS 

i n^ods 

U 1 JuHJ 

Homo sapiens

nuclear respiratory factor-2 subunit

Odd. 1 

RAQ 
0O7 

oZ 

886 

X52836 

Homo sapiens 

tryptophan hydroxylase (AA 1 - 444) 

2320 

98 

oo / 


Homo sapiens

elongation factor 2

44 AO 

inn l 

888 

AB039903 

Homo sapiens 

interferon-responsive finger protein 1
long form

1096 

98 


v< i 7£A 
AJ 1 / ou 

__ - 

Homo sapiens 

zinc finger protein (583 AA)


1 Clf\ 

oyu 

AJ243396

Homo sapiens

voltage-gated sodium channel beta-3
subunit

in7/i 

lUZ^f 

inA 
IUU 

891 

W67928 

Homo sapiens 

Fragment of human secreted protein 
encoded by gene 4. 

391 

100 

RQ7 

07Z 

/vdUZUjto 

Homo sapiens 

peptide transporter 3 

3U17 

1 AA 


I OOOHo 

Homo sapiens 

Membrane-bound protein PRO1120.

4/ZZ 

yy 

RQ4 

I OOOHO 

Homo sapiens 

Membrane-bound protein PRO1120.

*2£A£ 

J0U0 

yo 

RQ^ 

AZ7Z I o CO 
1 

Homo sapiens 

ly-N\Jv-iyya UN A encoding O- 

protein coupled 7 TM receptor with 
AJvUKi j activiry. 

TITO 

Zi /o 

1 AA 
100 

RQ£ 


— — ■ 

Homo sapiens 

Glucosidase II 

^o/;o 

OO 

yy 

R07 

yq»7<q 

nomo sapiens 

M-phase phosphoprotein 8 

1AQC 

1 AA 

898 

X57110 

Homo sapiens 

c-cbl protein 

4849 

99 


AoiojZ 

Homo sapiens 

inter-alpha-trypsin inhibitor heavy 
cnam 1 1 ihi 

3376 

98 

onn 

7UU 


— — ; 

Homo sapiens 

RJB protein binding protein 

ZdlO 

AA 

yy 

Q01 

1 1 1 6.71 

Homo sapiens 

zinc finger protein 

OA/in 
ZU4/ 

5o 



Homo sapiens 

Human homologue of UNC-53 (Hs- 
\jis*~>-jjfz.j sequence. 

Joy 


903 


HnmA CQnipnc 
riULLHJ ba\Jl<Pllb 

lab iClaieCI piUieill I\aUJU 

1 004 

inn 

IUU 

904 


\-Jf\vnf\ c*jr\i^Tic 
numo bdUieilb 

piaK.opnnm j 

4AA^ 
*400j 

1 nn 

7V J 

AT 0TS7QS 

Homo sapiens

hypothetical protein

7J7 

00 

77 


AF0S17R7 

numo oapiciio 

UlapildnOUo 1 

roi 


907 

AF70R < Hfi 

J1V7111U aapiciio 

nuuicoiiue umumg procein, indf 

1*3.77 
1J> /z 

1 00 

IUU 

908 

U79240 

Homo sapiens 

serine/threonine protein kinase 

2365 

98 

909

U79240

Homo sapiens

serine/threonine protein kinase

ZjOO 

00 

77 

910 

AT1T>S4^ 

Homo sapiens

protein kinase

OQ7 1 
Z7ZI 

inn 

91 1 

711 

ATI ^7 S4S 

Homo sapiens

protein kinase

1 fy"X1 

IOJ / 

QQ 

77 

912 

AT WWI'X 

Homo sapiens

hypothetical protein


00 

77 

9n 

Y67S7Q 

Homo sapiens

Human death inducer-obliterator 1
(DIO-1) polypeptide.

1 SR£ 

100 
IUU 

914 

YR7^47 

AO / J'tX 

Homo sapiens

Human giant larvae homologue

^^17 
JJ>1 / 

QQ 

77 

91S 

YR7^47 

AO /JtZ. 

Homo sapiens

Human giant larvae homologue

^40^ 
jHyj 

70 

916 

7 IU 


T-Taitia cdni^nc 
nvjinu bapiciib 

lomtn TV) 

7^S7 
ZJ> J / 

Q^ 

yj 

917 

AJ01 16S4 

Homo sapiens

triple LIM domain protein

^4^7 
J*f jZ 

100 
IUU 

QIC 

7 1 O 

A T1 ^ 1 R99 

icVJ 1 j 1 oyy 

Rattus
norvegicus

proline rich synapse associated
protein 1

^77A 

RR 
00 

919 

717 

AF0S49R6 

Homo sapiens

putative transmembrane GTPase

1 R 1£ 
1510 

1 nn 

IUU 

990 

TT0^R77 

Homo sapiens

putative transmembrane GTPase 

1ZJ / 

1 ho 
IUU 

921 

Y11588 

Homo sapiens 

apoptosis specific protein 

1492 

100 

922 

X84195 

Hatha csnipn? 

opvl r\1i r»c nn iit 5> 

^10 

100

923 

U72882 

Homo sapiens 

interferon-induced leucine zipper 
protein 

1409 

99 

924 

AE000660 

Homo sapiens 

hADV36Sl 

573 

100 

925 

AF126245 

Homo sapiens 

acyl-Coenzyme A dehydrogenase-8 
precursor 

2162 

100 
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SEQ 
ID 

NO: 

ACCESSION 

SPECIES 

DESCRIPTION 

SMITH- 
WATERMAN 

% 

IDENTITY 

926 

AE001968 

Deinococcus 
radiodurans 

hypothetical protein 

147 

11 

927 

W81576 

Homo sapiens 

EBV-induced G-protein coupled 
receptor (EBI-2) polypeptide. 

1778 

100 

928 

U01317 

Homo sapiens 

beta-globin 

687 

94 

929 

X98333 

Homo sapiens 

organic cation transporter 

2933 

100 

930 

Y91444 

Homo sapiens 

Human secreted protein sequence 
encoded by gene 42 SEQ ID 
NO: 165. 

1401 

i fin 

931 

Y91644 

Homo sapiens 

Human secreted protein sequence 
encoded by gene 43 SEQ ID 
NO:317.

1243 

100 

932 

D90279 

Homo sapiens 

collagen alpha 1(V) chain precursor 

569 

39 

933 

Z31560 

Homo sapiens 

sox-2 

1587 

96 

934 

AF147790 

Homo sapiens

transmembrane mucin 12 

3047 

99 

935 

Z85996 

Homo sapiens 

match: multiple proteins; match: 
Q08151 P28185 Q01111 Q43554; 
match: Q08150 Q40195 P20340 
Q39222; match: Q40368 P36412 
P40393 Q40723; match: CE01798 
Q38923 Q40191 Q41022; match: 
Q39433 Q40177 Q40218 Q08146; 
match: P10Q4Q PI 10?^ Hl^OdS
020337; match: 025389
P20336 P05713; match: P35276
Q08147 P17609 P22128; match:
Q15771 P36410 P35291; GTP-
binding

726 

94 

936 

AB041533

Homo sapiens 

sperm antigen 

1054 

JO 

937 

X91906 

Homo sapiens 

voltage-gated chloride ion channel 

3914 

100 

938 

AB032481 

Homo sapiens 

homeobox transcription factor 

1744 

100 

939 

AF111106

Homo sapiens 

protein serine/threonine phosphatase 
4 regulatory subunit 1 

4682 

99 

940 

Y17999

Homo sapiens

Dyrk1B protein kinase

3331 

99 

941 

AF305872 

Homo sapiens 

thyroglobulin 

455 

92 

942 

AF263462 

Homo sapiens 

cingulin 

5939 

99 

943 

AK024442 

Homo sapiens 

FLJ00032 protein 

1616 

61 

944 

Y35911 

Homo sapiens 

Extended human secreted protein 
sequence, SEQ ID NO. 160. 

262 


945 

AB015320

Homo sapiens

sigma 1B subunit of AP-1 clathrin
adaptor complex 

599 

71 

946 

Z82287 

Caenorhabditis 
elegans 

ZK550.2 

229 

35 

947 

D84223 

Homo sapiens 

leucyl tRNA synthetase 

6207 

99 

948 

U49057 

Rattus 
norvegicus 

rA9 

3846 

62 

949 

AK000568 

Homo sapiens 

unnamed protein product 

1659 

100 

950 

AL021578 

Homo sapiens 

dJ453C12.6.1 (uncharacterized
hypothalamus protein (isoform I)) 

257 

4? 

951 

AB032435 

Homo sapiens 

differentiation-associated Na- 
dependent inorganic phosphate 
cotransporter 

3063 

99 

952 

AF110532 

Homo sapiens 

uncoupling protein UCP-4 

1561 

100 

953 

X83587 

Mus musculus 

1A13 protein 

1420 

59 

954 

AL031665 

Homo sapiens 

dJ545L17.5.1 (novel protein) 

386 

53 

955 

Y87600

Homo sapiens 

Human fatty acid synthase-like 
protein (HFASLP). 

2377 

100 

956 

Y99421 

Homo sapiens 

Human PRO1433 (UNQ738) amino
acid sequence SEQ ID NO:292. 

522 

55 




957 

U68535 

Mus musculus 

aldo-keto reductase 

451 

73 

958 

AC007067 

Arabidopsis 

Tl 0024.10 

1594 

57 



thaliana 




959 

U72194 

Mus musculus 

muskelin 

3947 

99 

960 

AE003661

Drosophila 

CG15168 gene product 

277 

54 



melanogaster 




961 

X80332 

Mus musculus 

rab20

983 

82 

962 

Y67315 

Homo sapiens 

Human secreted protein BL 8 9 13 

3916 

99 




amino acid sequence. 



963 

Y67315 

Homo sapiens 

Human secreted protein BL89 1 3 

3916 

99 




amino acid sequence. 



964 

L32602 

Rattus 

homeodomain 159.341 

1821 

96 



norvegicus 




965 

Z97832 

Homo sapiens 

dJ329A5.3 (BQAA06460 protein) 

3581 

99 

966 

W88995 

Homo sapiens 

Polypeptide fragment encoded by
gene 146.

176

39



967 

U12465

Homo sapiens 

ribosomal protein L35 

604 

100 

968 

AF151803 

Homo sapiens 

CGI-45 protein 

1101 

78 

969 

W74865 

Homo sapiens

Human secreted protein encoded by
gene 137 clone HMWIF35

1348

98



970 

L21936 

Homo sapiens

succinate dehydrogenase flavoprotein
subunit

703

100



971 

AJ133521 

Drosophila buzzatii

protease, reverse transcriptase,
ribonuclease H, integrase

194

23



972 

AC006017 

Homo sapiens

N-acetylgalactosaminyltransferase;
similar to Q10473 (PID:g1709559)

3271

100



973 

Z81317 

Schizosaccharomyces pombe

DNA2-NAM7 helicase family
protein

685

31



974 

M17885 

Homo sapiens

acidic ribosomal phosphoprotein (P0)

792 

100 

975 

U22829 

Mus musculus

P2Y purinoceptor

399 

40 

976 

AL132772

Homo sapiens

dJ1013A22.1 (hepatic nuclear factor
4, alpha)

2466

99



977 

AC003973 

Homo sapiens

ZNF91L 

1550 

43 

978 

J04031 

Homo sapiens 

MDMCSF (EC 1.5.1.5; EC 3.5.4.9;
EC 6.3.4.3)

2824

63



979 

AF136715 

Homo sapiens 

taxol resistant associated protein 

217 

76 

980 

AF136715 

Homo sapiens 

taxol resistant associated protein 

306 

95 

981 

Z92822 

Caenorhabditis elegans

ZK520.1

1109

44




982 

AJ295149 

Homo sapiens 

putative dipeptidase 

1564 

99 

983 

AL021331 

Homo sapiens

dJ366N23.3 (KIAA0173 and
Tubulin-Tyrosine Ligase LIKE)

1492

300



984 

AL161501 

Arabidopsis thaliana

putative adenosine deaminase

370

38





TABLE 3 


SEQ 
ID 
NO: 

ACCESSION 
NO. 

DESCRIPTION 

RESULTS* 

2 

BL00282 

Kazal serine protease inhibitors family 
proteins. 

BL00282 16.88 4.259e-14 97-120 

3 

BL00298 

Heat shock hsp90 proteins family 
proteins. 

BL00298A 10.97 1.000e-40 74-119
BL00298E 27.30 1.000e-40 321-376
BL00298F 11.21 1.000e-40 409-464
BL00298H 20.50 1.000e-40 553-607
BL00298C 16.40 2.286e-40 186-230
BL00298B 15.64 1.290e-39 134-181
BL00298G 24.57 5.345e-39 465-520
BL00298I 30.07 7.818e-34 661-715
BL00298D 17.97

4 

PR00237 

RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE

PR00237A 11.48 4.316e-13 57-82

5 

PD02454 

!!!! PROTEIN ALU SUBFAMILY 
WARNING ENTRY NUCLEAR 
PHOSPHO. 

PD02454B 11.61 4.309e-17 75- 
103 

6

DM00864

EGF-LIKE DOMAIN.

DM00864A 15.21 7.429e-09 98-

1 1 o 

7 

PR00237 

RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE

PR00237A 11.48 1.750e-11 29-54
PR00237D 8.94 7.000e-09 138- 

1 £LC\ ni)flf\T}TD n CA O AA 

lol) FKUUzi/r) 13.5U 5.2MJe-Uy 
61-83 

9 

PF00855 

PWWP domain proteins. 

PF00855 13.75 5.667e-15 272-289 

10 

BL00139 

Eukaryotic thiol (cysteine) proteases 
cysteine proteins. 

BL00139D 9.24 4.400e-11 391-408
BL00139A 10.29 7.511e-09 67-77

12 

BL01113 

C1q domain proteins.

BL01113B 18.26 9.294e-19 689-725
BL01113C 13.18 4.857e-11 757-777
BL01113D 7.47 2.161e-10 790-800

13 

BL01113 

C1q domain proteins.

BL01113B 18.26 3.813e-14 599-635
BL01113C 13.18 4.857e-11 667-687
BL01113D 7.47 2.161e-10 700-710

14 

BL00594 

Aromatic amino acids permeases 
proteins. 

BL00594A 16.75 6.531e-10 50-94

15 

BL01047 

Heavy-metal-associated domain proteins. 

BL01047B 19.73 4.913e-13 707-728

16 

PR00625 

DNAJ PROTEIN FAMILY 
SIGNATURE 

PR00625A 12.84 7.462e-18 310- 
330 PR00625B 13.48 3.939e-15 
340-361 

18 

BL00615 

C-type lectin domain proteins. 

BL00615A 16.68 3.700e-09 144- 
162 

20 

PR00741 

GLYCOSYL HYDROLASE FAMILY 
29 SIGNATURE 

PR00741D 16.11 9.082e-21 175-195
PR00741F 14.66 9.262e-21 243-265
PR00741B 14.23 1.947e-18 128-145
PR00741G 9.29 2.180e-17 318-340
PR00741C 9.16 7.328e-17 147-166
PR00741H 10.32 2.141e-13 351-

11 A DD AAT/t f A A O/t 0 CO^ a 1 1 

J/4 PKUU/4IA 9.24 i.jyoe-1 J 

89-105 PR00741E 13.39 3.535e- 

12 21 J-2J2 

22 

BL00107 

Protein kinases ATP-binding region 
proteins. 

BL00107A 18.39 3.647e-20 1 17- 

1/1Q r>T AA1AOI3 n Ol 1 AAAa I A. 

182-198 

23

BL00107

Protein kinases ATP-binding region
proteins.

BL00107A 18.39 1.600e-23 126-157

24 

BL00107

Protein kinases ATP-binding region

BL00107A 18.39 1.600e-23 120-

27 

BL00239 

Receptor tyrosine kinase class II proteins. 

BL00239B 25.15 2.324e-16 91-139

28 

BL00018 

EF-hand calcium-binding domain 
proteins. 

BL00018 7.41 3.250e-10 681-694 
BL00018 7.41 6.400e-10 717-730 

29 

BL00018 

EF-hand calcium-binding domain
proteins.

BL00018 7.41 3.250e-10 681-694
BL00018 7.41 6.400e-10 717-730

30 

BL01113

C1q domain proteins.

BL01113A 17.99 9.308e-09 54-81

33 

PD01168 

SYNTHETASE LIGASE PROTEIN 
ALANYL. 

PD01168L 9.47 1.667e-09 401- 
416 

34 

PD01168

SYNTHETASE LIGASE PROTEIN
ALANYL.

PD01168L 9.47 1.667e-09 411-426

36 

PR00426 

C5A-ANAPHYLATOXIN RECEPTOR 
SIGNATURE 

PR00426D 10.59 3.618e-12 110-122

37 

PF00791 

Domain present in ZO-1 and Unc5-like
netrin receptors. 

PF00791B 28.49 2.049e-10 1080- 
1135 

38 

BL00350 

MADS-box domain proteins. 

BL00350 20.79 1.000e-40 1-55

40 

BL00123 

Alkaline phosphatase proteins. 

BL00123B 19.31 1.000e-40 90-133
BL00123C 24.61 1.000e-40 145-195
BL00123E 22.25 1.000e-40 304-358
BL00123G 26.01 1.000e-40 438-488
BL00123F 19.03 8.714e-35 364-399
BL00123A 10.80 9.000e-24 52-77
BL00123D 12.73 1.000e-17 216-229

44 

PD00066 

PROTEIN ZINC-FINGER METAL-
BINDI.

PD00066 13.92 2.800e-14 346-359
PD00066 13.92 4.600e-14 486-499
PD00066 13.92 1.000e-13 374-387
PD00066 13.92 6.000e-13 458-471
PD00066 13.92 2.714e-12 234-247
PD00066 13.92 3.143e-12 430-443
PD00066 13.92 8.714e-12 514-527
PD00066 13.92 3.739e-11 402-415
PD00066 13.92 2.038e-10 318-331

45 

DM00973 

3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 

DM00973A 21.17 2.946e-10 180-217

47 

BL00649 

G-protein coupled receptors family 2 
proteins. 

BL00649C 17.82 1.682e-10 475- 
501 BL00649B 20.68 7.387e-09 
417-463 

50 

PD00066 

PROTEIN ZINC-FINGER METAL- 
BINDI. 

PD00066 13.92 8.200e-16 445-458
PD00066 13.92 5.846e-15 305-318
PD00066 13.92 1.000e-14 221-234
PD00066 13.92 1.000e-14 417-430
PD00066 13.92 2.800e-14 249-262
PD00066 13.92 2.800e-14 277-290
PD00066 13.92 8.800e-14 333-346
PD00066 13.92 9.400e-14 361-374
PD00066 13.92 4.000e-13 389-402
PD00066 13.92 6.571e-12 473-486

51 

BL00226 

Intermediate filaments proteins. 

BL00226D 19.10 1.000e-40 417-464
BL00226B 23.86 3.348e-35 251-299
BL00226C 13.23 1.429e-24 316-347
BL00226A 12.77 1.857e-15 151-166

52 

PR00217 

43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 

PR00217C 10.91 5.648e-09 133- 
149 

53 

BL00232 

Cadherins extracellular repeat proteins 
domain proteins. 

BL00232B 32.79 1.000e-40 143-191
BL00232A 27.72 2.350e-28 49-82
BL00232B 32.79 7.052e-21 252-300
BL00232C 10.65 6.625e-20 250-268
BL00232B 32.79 1.314e-11 367-415
BL00232C 10.65 9.308e-10 470-488

54 

BL00303 

S-100/ICaBP type calcium binding
protein.

BL00303B 26.15 8.759e-23 125-162
BL00303A 21.77 1.000e-21 82-119

58 

PR00378 

INOSITOL PHOSPHATASE 
SIGNATURE 

PR00378D 16.86 1.000e-15 242- 
109-129 

59 

PR00425 

BRADYKININ RECEPTOR 
SIGNATURE 

PR00425C 13.23 9.040e-12 120- 
140 

60 

BL00280 

Pancreatic trypsin inhibitor (Kunitz) 
family proteins. 

BL00280 24.61 6.727e-38 238-282
BL00280 24.61 1.514e-30 294-338

65 

BL01019 

ADP-ribosylation factors family proteins. 

BL01019A 13.20 1.222e-11 43-83

68 

PR00237 

RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 

PR00237E 13.03 5.091e-13 188- 
212 PR00237G 19.63 7.207e-13 
268-295 PR00237A 11.48 4.375e- 
11 Z4-4y JrKUUzi/U \j.oy 
3.057e-10 101-124 PR00237D 

PR00237F 13.57 5.364e-l 0230- 

57-79 

70 

PD01066 

PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


71 

PR00830

ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE

PR00830A 8.41 8.759e-12 348-

1AR 

72 

BL00120 

Lipases, serine proteins. 

BL00120B 11.37 2.149e-10 148- 
163 

77 


PR00753

1-AMINOCYCLOPROPANE-1-
CARBOXYLATE SYNTHASE
SIGNATURE

rKUU/3ib 6.UJ J.jjze-11 191- 
216 PR00753D 6.85 2.778e-09 

111 K? 

78 

PR00506

D21 CLASS N6 ADENINE-SPECIFIC
DNA METHYLTRANSFERASE
SIGNATURE

PR00506C 19.40 8.017e-09 96-

1 1 Q 

82 

BL00107 

Protein kinases ATP-binding region 
proteins. 

BL00107A 18.39 3.571e-16 436- 
467 

84 

BL00675 

Sigma-54 interaction domain
ATP-binding region A proteins.

RT 0ft67SA 94 R£ R ROOp 10 9^ 

300 

85 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 2.286e-30 117-160 

87 

BL00250 

TGF-beta family proteins. 

BL00250A 21.24 6.786e-36 264-300
BL00250B 27.37 1.450e-26

119. 1f%A 

91 

BL00215

Mitochondrial energy transfer proteins.

FtT 0H91 ^ A 1 ^ 91 Q 9^0*. 17 1H 1^ 

dlajkjl i j/\ U.6Z y.zjue-i / iu-jj 
BL00215A 15.82 6.000e-16221- 

946 RT 0091 1 ^ 89 7 8^7^-19 

108-133 BL00215B 10.44 9.526e- 
11 168-181 

92 

BL00027 

'Homeobox' domain proteins. 

BI 00027 26 43 9 526e-94 394-167 

95 

PR00094 

ADENYLATE KINASE SIGNATURE 

PR00094C 12.94 1.000e-08 119-136

96 

PD02327 

GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 

PD02327B 19.84 2.091e-09 143-165

97 

BL00752

XPA protein. 

RI 00759R 1 9 17 7 109<»-09 98-79 

XJLfKJXJ 1 J Z,L3 17.1 / /.JU7C U7 iO 

98 

PR00876 

NEMATODE METALLOTHIONEIN 
SIGNATURE 

PR00876B 7.66 2.268e-10 135- 
149 

99 

PR00109 

TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 

PR00109B 12.27 9.824e-12 122- 
141 

100 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 7.429e-31 118-161

101 

BL00028 

Zinc finger, C2H2 type, domain proteins. 

BL00028 16.07 6.870e-12 370-387 
BL00028 16.07 6.885e-11 398-415
BL00028 16.07 8.269e-11 342-359
BL00028 16.07 4.300e-10 229-246
BL00028 16.07 6.100e-10 258-275

102 

PR00048 

C2H2-TYPE ZINC FINGER 
SIGNATURE 

PR00048A 10.52 7.750e-14 665-679
PR00048A 10.52 8.500e-14 581-595
PR00048A 10.52 9.250e-14 637-651
PR00048A 10.52 2.059e-12 609-623
PR00048A 10.52 2.588e-12 469-483
PR00048A 10.52 7.353e-12 553-567
PR00048A 10.52 2.895e-11 525-539
PR00048A 10.52 4.316e-11 441-455
PR00048A 10.52 5.263e-11 413-427
PR00048B 6.02 2.125e-10 569-579
PR00048B 6.02 4.938e-10 513-523
PR00048A 10.52 5.696e-10 497-511
PR00048B 6.02 8.875e-10 429-439
PR00048B 6.02 1.000e-09 457-467
PR00048B 6.02 6.684e-09 485-495

103 

PR00195 

DYNAMIN SIGNATURE 

PR00195A 11.94 5.364e-22 31-50 

PROfllQSR 9 47 1 7fttp-?1 Sfi 74 

PR00195C 11.50 3.455e-21 126- 
144 PR00195D 11.76 8.71 4e-21 

17S-194 PR00195F \f% 90 R ^OOp- 

20 217-237 PR00I95E9.82 
8.650e-20 194-211 

104 

BL01113 

C1q domain proteins.

BL01113A 17.99 1.865e-09 121- 

148 RI 01 1 n A 1 7 QQ ^ R46*» HO 

82-109 

105 

BL00420 

Speract receptor repeat proteins domain
proteins.

BL00420A 20.42 6.400e-11 70-99
RI 00490 A 90 49 R S9Sp-10 73- 

102 BL00420A 20.42 5.708e-09 
85-114 

108 

PR00860 

VERTEBRATE METALLOTHIONEIN 
SIGNATURE 

PR00860B 7.04 2.929e-20 27-41
PR00860A 5.46 5.500e-16 5-18
PR00860C 9.61 1.474e-14 41-51

112 

BL01031 

Heat shock hsp20 proteins family profile. 

BL01031C 17.68 6.400e-10 122- 
147 

114 

DM01840 

kw SPAC24B11.09 R07E5.13. 

DM01840B 22.04 2.688e-40 59- 

10^ DM01R40A 10 9S9 V7Ip-H 
iv/ j lyiviuio'rurt 11/. .z j y .u i j 

31-43 

115 

BL01126 

Elongation factor Ts proteins. 

BL01126A 18.48 2.317e-30 46-89
BL01126B 13.15 7.387e-19 116-135
BL01126C 9.20 9.735e-11 190-203

116 

BL00216 

Sugar transport proteins. 

BL00216B 27.64 4.375e-21 35-85 

118 

BL00437 

Catalase proximal heme-ligand proteins.

RI 00417A 1 R R9 1 000p-40 49- 
101 BL00437B 16.28 1.000e-40 
114-168 BL00437C 21.86 l.OOOe- 
40 190-239 BI 00437D9S 72 
1.000e-40 248-301 BL00437E 
23 95 1 000e-40 327-379 

119 

BL00140 

Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ.

RI 00 MOD 7? 64 R ?74p-14 164- 
208 BL00140C 11.80 5.444e-10 
77-102 

120 

BL00224 

Clathrin light chain proteins. 

BL00224B 16.94 6.712e-10 95- 
148 

122 

BL00203 

Vertebrate metallothioneins proteins. 

BL00203 13.94 1.000e-40 16-62 

123 

PR00041 

CAMP RESPONSE ELEMENT
BINDING (CREB) PROTEIN
SIGNATURE

PR00041D 7.95 2.906e-09 24-41


124 

PR00041 

CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN 
SIGNATURE 

PR00041D 7.95 2.906e-09 24-41 

125 

BL00061 

Short-chain dehydrogenases/reductases 
family proteins. 

BL00061C 7.86 3.250e-10 212-222

126 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 6.400e-25 251-290 

127 

PR00318 

ALPHA G-PROTEIN (TRANSDUCIN)
SIGNATURE

PR00318D 16 98 1 Q00p-14?1Q- 
248 PR00318B 14.79 3.455e-27 168-191
PR00318C 12.09 7.000e-23 197-215
PR00318A 7.84 1.600e-19 35-51
PR00318E 7.23 2.500e-12 265-275

128 

PR00927 

ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 

PR00927E 14.93 9.743e-10 67-89
PR00927B 14.66 4.575e-09 69-91 

130 

BL00824 

Elongation factor 1 beta/beta'/delta chain
proteins.

BL00824B 9.21 7.750e-22 133- 
153 

131 

BL00824 

Elongation factor 1 beta/beta'/delta chain
proteins.

BL00824C 14.58 1.000e-40 166- 
204 RT O0R94D 14 04 1 6?1p-^8 
204-239 BL00824B9.21 7.750e- 
22 133-153 BL00824E 12 49 
1.000e-19 247-263 

132 

PR00209 

ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 

PR00209B 4.88 9.222e-13 1209- 
1228 

133 

PR00209 

ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 

PR00209B 4.88 9.222e-13 1168-1187

134 

PR00708 

ALPHA-1-ACID GLYCOPROTEIN

SIGNATURE 

Jriwu f\Jou ih.o/ i.uuue-z/ l^ti- 
168 PR00708C 11.77 1.643e-25 
98-120 PR00708B 15 15 2 174e- 
24 73-95 PR00708E 13.33 
1 600e-21 189-207 PR00708A 
14.40 2.636e-21 51-70 

135 

PR00109 

TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 

PR00109B 12.27 8.468e-13 126- 
145 

136 

PF00023 

Ank repeat proteins. 

PF00023A 16.03 3.250e-10 201- 
217 

137 

BL00471 

Small cytokines (intercrine/chemokine) 
C-x-C subfamily signat. 

BL00471 23.92 7.480e-10 42-90 

140

PR00205 

CADHERIN SIGNATURE 

PR00205B 11.39 5.582e-10 328-346
PR00205B 11.39 9.018e-10 543-561

141 

BL00412 

Neuromodulin (GAP-43) proteins. 

BL00412D 16.54 7.704e-09 976- 
1027 

143 

PR00979 

TAFAZZIN SIGNATURE 

PR00979E 10.83 5.950e-26 192-214
PR00979A 11.91 8.773e-25 63-83
PR00979C 12.16 6.400e-19 108-124
PR00979D 12.38 7.955e-19 170-185
PR00979F 10.14 3.382e-15 230-244
PR00979B 15.59 5.636e-15 94-106

145 

DM00686 

kw REPLICATION REP 28K 17.7K. 

DM00686C 14.14 7.720e-09 111- 
131 

146 

PR00604 

CLASS IA AND IB CYTOCHROME C 
SIGNATURE 

PR00604D 15.86 1.000e-17 87-104
PR00604B 12.73 9.591e-16 57-73
PR00604C 10.21 8.200e-12 73-84
PR00604E 10.13 1.000e-11 106-117
PR00604A 11.13 8.800e-11 44-52
PR00604F 8.60 1.000e-10 123-132

147 

BL00107 

Protein kinases ATP-binding region 
proteins. 

BL00107A 18.39 3.864e-15 266-297
BL00107B 13.31 6.143e-11 335-351

148 

PD00289 

PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 

PD00289 9.97 8.448e-09 67-81 

149 

PR00069 

ALDO-KETO REDUCTASE 
SIGNATURE 

PR00069D 19.36 1.857e-30 187-
217 PR00069A 16.01 7.429e-25 
41-66 PR00069E 18.14 3.100e-22 
235-260 PR00069C 16.03 7.000e- 
20 151-169 PR00069B 11.33 
8.071e-19 101-120 

150 

BL00027 

'Homeobox' domain proteins.

BL00027 26.43 2.688e-27 139-182 

151 

PD02906 

SYNTHASE I PSEUDOURIDYLATE
PSEUDOURIDINE LYASE TR. 

PD02906C 24.17 7.070e-22 165- 
200 PD02906B 15.35 8.393e-15 
114-127 PD02906A 10.84 6.500e- 
09 71-84 

153 

BL00479 

Phorbol esters / diacylglycerol binding 
domain proteins. 

BL00479A 19.86 5.091e-12 891- 
914 BL00479B 12.57 1.837e-ll 
915-931 

158 

BL00027 

'Homeobox' domain proteins.

BL00027 26.43 6.786e-31 143-186 

160 

BL00422 

Granins proteins. 

BL00422C 16.18 7.750e-12 420- 
448 

162 

PR00625 

DNAJ PROTEIN FAMILY 
SIGNATURE 

PR00625A 12.84 9.297e-11 62-82

164 

BL01282 

BIR repeat proteins. 

BL01282B 30.49 6.182e-10 347- 
386 

166 

PR00860 

VERTEBRATE METALLOTHIONEIN 
SIGNATURE 

PR00860B 7.04 2.929e-20 83-97 
PR00860A 5.46 1.000e-18 61-74 
PR00860C 9.61 1.900e-15 97-107

167 

PR00449 

TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 

PR00449A 13.20 7.052e-09 196- 
218 

169 

BL00514 

Fibrinogen beta and gamma chains C- 
terminal domain proteins. 

BL00514C 17.41 1.346e-39 316-353
BL00514G 15.98 2.241e-34 471-501
BL00514H 14.95 6.571e-27 510-535
BL00514E 14.28 1.273e-16 388-405
BL00514D 15.35 9.100e-15 369-382
BL00514B 16.42 4.857e-14 260-276
BL00514F 11.65 9.690e-14 416-431
BL00514A 11.68 8.200e-11 149-159

170 

BL00514 

Fibrinogen beta and gamma chains C- 
terminal domain proteins. 

BL00514C 17.41 1.346e-39 268-305
BL00514G 15.98 2.241e-34 423-453
BL00514H 14.95 6.571e-27 462-487
BL00514E 14.28 1.273e-16 340-357
BL00514D 15.35 9.100e-15 321-334
BL00514B 16.42 4.857e-14 212-228
BL00514F 11.65 9.690e-14 368-383
BL00514A 11.68 8.200e-11 101-111

171 

BL00514 

Fibrinogen beta and gamma chains C-
terminal domain proteins.

BL00514G 15.98 2.241e-34 385-415
BL00514H 14.95 6.571e-27 424-449
BL00514C 17.41 4.632e-24 230-267
BL00514E 14.28 1.273e-16 302-319
BL00514D 15.35 9.100e-15 283-296
BL00514B 16.42 4.857e-14 212-228
BL00514F 11.65 9.690e-14 330-345
BL00514A 11.68 8.200e-11 101-111

173 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 9.400e-29 119-162

174 

DM01970 

Okw ZK632.12YDR313C 
ENDOSOMAL III. 

DM01970B 8.60 5.119e-15 1391-1404

176 

BL00773 

Chitinases family 19 proteins. 

BL00773C 9.42 8.000e-09 2-16 

182 

PR00109 

TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 

PR00109B 12.27 9.163e-14 141- 
160 

183 

PD01937 

DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-. 

PD01937A 6.68 3.475e-09 221- 
232 

185 

BL00845 

CAP-Gly domain proteins.

BL00845 16.43 2.946e-23 247-272 
BL00845 16.43 1.628e-21 107-132 

186 

PR00452 

SH3 DOMAIN SIGNATURE 

PR00452B 11.65 6.538e-11 525-541

187

PR00452

SH3 DOMAIN SIGNATURE

PR00452B 11.65 6.538e-11 497-513

188 

DM01803 

1 HERPESVIRUS GLYCOPROTEIN H. 

DM01803A 10.51 1.000e-09 1081-1102

189 

PF00651 

BTB (also known as BR-C/Ttk) domain 
proteins. 

PF00651 15.00 5.091e-15 69-82 

190 

PR00194 

TROPOMYOSIN SIGNATURE 

174 PR00194E 8.74 3.250e-30
231-257 PR00194D 9.57 1.500e-
26 175-190 PR00194B 10.94
5.200e-24 120-141 PR00194A
7 86 4 857e-21 84-10^ 

192 

PD02042 

IRON-SULFUR ELECTRON 
TRANSPORT AROMATIC 
HYDROCARB. 

PD02042B 16.75 5.154e-09 131-146
PD02042A 21.13 5.909e-09 94-121

193 

PR00021

SMALL PROLINE-RICH PROTEIN 
SIGNATURE 

PR00021A 4.31 2.200e-10 2-15

195 

BL00463 

Fungal Zn(2)-Cys(6) binuclear cluster 
domain proteins. 

BL00463 8.22 5.071e-09 111-123

196 

PR00118 

BETA-LACTAMASE CLASS A 
SIGNATURE 

PR00118F 16.42 9.386e-09 165- 
181 

197 

DM00215 

PROLINE-RICH PROTEIN 3. 

DM00215 19.43 5.424e-09 234- 
267 

198 

BL00660

Band 4.1 family domain proteins.

BL00660A 31.50 5.500e-11 714-767

199 

BL00282 

Kazal serine protease inhibitors family 
proteins. 

BL00282 16.88 8.820e-13 70-93 

202 

PR00009 

TYPE I EGF SIGNATURE 

PR00009A 14.15 5.345e-15 971-987
PR00009C 14.11 8.773e-13 996-1008
PR00009D 16.83 8.000e-11 1008-1018
PR00009C 14.11 1.882e-09 892-904

203 

BL00025 

P-type 'Trefoil' domain proteins.

BL00025 17.17 4.536e-19 38-59 

205 

BL00018

EF-hand calcium-binding domain 
proteins. 

BL00018 7.41 7.300e-10 165-178 

206 

PR00168 

SLOW VOLTAGE-GATED 
POTASSIUM CHANNEL SIGNATURE 

PR00168D 12.88 6.865e-11 67-86

207 

BL00025 

P-type 'Trefoil' domain proteins.

BL00025 17.17 3.423e-20 39-60
BL00025 17.17 8.750e-16 88-109 

209 

BL00646 

Ribosomal protein S13 proteins. 

BL00646B 21.42 6.100e-30 110-143
BL00646A 25.82 6.192e-29 14-62

210 

PR00138 

MATRIXIN SIGNATURE

PR00138D 16.56 3.605e-25 279-
305 PR00138C 16.41 3.000e-24
218-247 PR001 38E 6.01 8.7 Re- 
tt 314-328 PR00138A 15.14 
9.538e-13 134-148 PR00138B
15.82 4.522e-12 188-204

211 

DM01206 

CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 

DM01206B 10.69 8.429e-12 386-
406 DM01206B 10.69 1.247e-10 
384-404 DM01206B 10.69 
5.068e-10 388-408 

212 

PD01941 

TRANSMEMBRANE 
COTRANSPORTER SYMP. 

PD01941A 14.81 1.000e-40 163- 
217 PD01941B 15.02 9.705e-30 

490-467 PD01Q41F 1 S Q? R 714p- 
23 837-884 PD01941C 19.96
8.200e-20 508-563 PD01941D
27.18 1.600e-16 661-710
PD01941F 28.52 9.645e-15 1005-
1060

213 

BL00362 

Ribosomal protein S15 proteins.

BL00362 24.67 8.313e-09 330-373

214 

BL00115

Eukaryotic RNA polymerase II
heptapeptide repeat proteins.

RT 001 1 ^ 1? 7 1?Sf»-09 1 17R- 
1227 BL00115Z 3.12 6.096e-09
1164-1213

215 

BL00038 

Myc-type, 'helix-loop-helix' dimerization
domain proteins.

RT 0001KR 16 Q7 1 600p-1 R 
146 BL00038A 13.61 1.474e-13
102-118

216 

BL01108 

Ribosomal protein L24 proteins. 

BL01108A 20.33 2.241e-22 49-82
BL01108B 11.40 8.457e-10 96- 
107 

217 

PR00381 

KINESIN LIGHT CHAIN SIGNATURE 

PR00381A 9.55 1.321e-10 360- 
378 

222 

BL00514 

Fibrinogen beta and gamma chains C-
terminal domain proteins.

BL00514C 17.41 2.358e-26 1166- 
i7fn rt nns ]Ad i s qr q nnnp» i ^ 

1289-1319 BL00514D 15.35 
6.936e-12 1207-1220 BL00514F 
1 1 6S 4 ?RRp-10 PS1-I76R 
BL00514H 14.95 8.636e-10 1318- 
1343 

223 

BL00325 

Actin-depolymerizing proteins.

BL00325B 21.66 1.000e-40 93-139
BL00325A 24.83 9.333e-24 61-93

224 

BL00018 

EF-hand calcium-binding domain 
proteins.

BL00018 7.41 1.450e-10 231-244

225 

PF01329 

Pterin 4 alpha carbinolamine dehydratase.

PF01329B 18.52 1.692e-18 67-92 

228 

BL00211 

ABC transporters family proteins.

BL00211B 13.37 6.250e-18 1033-1065
BL00211B 13.37 8.875e-18 2045-2077
BL00211A 12.23 1.900e-09 931-943

230 

PR00761 

BINDIN PRECURSOR SIGNATURE

PR00761A 5.81 9.366e-09 275- 
292 

231 

PR00049 

WILM'S TUMOUR PROTEIN 
SIGNATURE 

PR00049D 0.00 3.500e-10 54-69

232 

BL00412 

Neuromodulin (GAP-43) proteins. 

BL00412D 16.54 1.978e-10 109-160
BL00412D 16.54 4.122e-09 133-184

233 

BL01210 

Caveolins proteins. 

BL01210B 13.92 8.129e-09 106- 
156 

236 

BL00939 

Ribosomal protein L1e proteins.

BL00939F 17.27 5.393e-09 861- 
891 

238 

BL01252 

Endogenous opioids neuropeptides 
precursors proteins. 

BL01252D 18.25 3.571e-28 205-233
BL01252B 19.09 5.034e-27 37-67
BL01252C 18.10 1.621e-21 164-190
BL01252A 14.22 7.107e-18 14-34

239 

BL00302 

Eukaryotic initiation factor 5A hypusine
proteins. 

BL00302 14.81 1.000e-40 25-79 

240 

PR00420 

AROMATIC-RING HYDROXYLASE 
(FLAVOPROTEIN 
MONOOXYGENASE) SIGNATURE 

PR00420A 14.78 8.851e-13 26-49 

241 

PD02929 

ADHESION GLYCOPROTEIN 
PRECURSOR I. 

PD02929A 28.27 4.529e-09 235- 
289 

243 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 8.527e-25 11-50

244 

BL01270 

Band 7 protein family proteins. 

BL01270C 16.91 6.745e-17 115-144
BL01270B 18.74 6.857e-17 76-115
BL01270E 13.03 6.016e-15 182-211
BL01270D 20.87 9.160e-13 144-182

245 

PF00791 

Domain present in ZO-1 and Unc5-like 
netrin receptors. 

PF00791B 28.49 6.305e-12 253-308
PF00791B 28.49 1.909e-11 427-482
PF00791B 28.49 2.651e-09 179-234
PF00791B 28.49 3.890e-09 112-167

246 

PD00066 

PROTEIN ZINC-FINGER METAL- 
BINDI. 

PD00066 13.92 2.500e-13 277-290
PD00066 13.92 9.143e-12 193-206
PD00066 13.92 5.304e-11 165-178
PD00066 13.92 6.478e-11 249-262
PD00066 13.92 3.423e-10 221-234

247 

BL00406 

Actins proteins. 

BL00406D 12.58 6.400e-20 465-520
BL00406B 5.47 4.857e-14 249-304
BL00406E 8.44 1.000e-11 522-572
BL00406C 6.75 5.449e-11 313-368

248 

BL00951 

ER lumen protein retaining receptor 
proteins. 

BL00951C 19.35 1.000e-40 112-161
BL00951A 15.10 7.750e-39 21-57
BL00951D 13.94 6.000e-38 161-196
BL00951B 14.23 3.100e-31 57-88

252 

BL01113 

C1q domain proteins.

BL01113A 17.99 9.129e-15 200-227
BL01113A 17.99 4.818e-14 194-221
BL01113A 17.99 7.818e-14 182-209
BL01113A 17.99 1.730e-13 185-212
BL01113A 17.99 6.595e-13 191-218
BL01113A 17.99 6.077e-12 203-230
BL01113A 17.99 9.182e-11 179-206
BL01113A 17.99 2.532e-10 176-203
BL01113A 17.99 9.043e-10 218-245
BL01113A 17.99 9.426e-10 209-236
BL01113A 17.99 4.115e-09 137-164

257 

BL00845 

CAP-Gly domain proteins.

BL00845 16.43 1.837e-21 466-491 

259 

PR00248 

METABOTROPIC GLUTAMATE 
GPCR SIGNATURE 

PR00248G 12.67 2.688e-09 53-78 

260 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 3.400e-10 441-452 
BL00678 9.67 5.800e-10 481-492
BL00678 9.67 8.800e-10 358-369 

261 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 3.400e-10 415-426
BL00678 9.67 5.800e-10 455-466
BL00678 9.67 8.800e-10 332-343

262 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 3.400e-10 468-479 
BL00678 9.67 5.800e-10 508-519
BL00678 9.67 8.800e-10 385-396 



Src homology 3 (SH3) domain proteins
profile.

BL50002B 15.18 2.200e-10 415- 
429 

264

BL00049

Ribosomal protein L14 proteins.

BL00049C 17.38 3.040e-12 94- 
130 

265 

PD01469 

GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 

PD01469 20.59 2.091e-14 438-470 

266 

PD01469 

GLYCOPROTEIN PROTEIN 
PRECURSOR SA. 

PD01469 20.59 2.091e-14 279-311

267

BL00567

Phosphoribulokinase proteins.

BL00567A 10.66 1.161e-12 36-55

269

BL00049

Ribosomal protein L14 proteins.

BL00049C 17.38 2.688e-28 92- 
128 BL00049B 18.42 6.806e-24 
54-86 BL00049A 13.86 8.333e-19 
19-42 BL00049D 13.47 5.765e-12 
129-140 

272 

BL01115 

GTP-binding nuclear protein ran proteins. 

BL01115A 10.22 9.735e-12 14-58 

273

PR00021

SMALL PROLINE-RICH PROTEIN
SIGNATURE

PR00021A 4.31 1.911e-09 819-832

275 

PR00179 

LIPOCALIN SIGNATURE 

PR00179B 9.56 2.895e-13 124-137
PR00179A 13.78 3.250e-11 36-49
PR00179C 19.02 6.040e-11

i jt i / \j 

276 

PR00449 

TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 

PR00449A 13.20 8.364e-17 22-44
PR00449C 17.27 1.000e-13 62-85
PR00449E 13.50 4.000e-12 172-195
PR00449B 14.34 5.680e-10

277 

BL00140 

Ubiquitin carboxyl-terminal hydrolase
family 1 cysteine activ.

BL00140D 22.64 1.000e-40 161-205
BL00140C 11.80 9.053e-30 79-104
BL00140A 15.96 9.400e-28 5-35
BL00140B 12.29 4.649e-17 37-55

278

PD02712

ELEMENT TRANSPOSASE FOR
TRANSPOSON TRANSPOSABLE.

PD02712A 23.03 8.013e-09 47-83 

279 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 1.474e-09 100-111 

282

DM00892

3 RETROVIRAL PROTEINASE.

DM00892C 23.55 4.767e-21 864-898



Protamine P1 proteins.

BL00048 6.39 9.550e-09 56-83 

286 

PR00081 

GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 

PR00081A 10.53 1.878e-ll 36-54 

287

PR00310

ANTI-PROLIFERATIVE PROTEIN
BTG1 FAMILY SIGNATURE

PR00310B 10.59 4.231e-17 29-59
PR00310D 9.10 6.679e-16 89-119

289

PD01066

PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.

PD01066 19.43 7.000e-36 37-76

293 

BL00979 

G-protein coupled receptors family 3
proteins.

BL00979L 20.63 3.800e-12 111- 
152 

295 

PD02411 

PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 

PD02411 21.89 7.000e-16 195-229

296 

BL01064 

Pyridoxamine 5'-phosphate oxidase
proteins.

BL01064A 27.84 8.313e-28 77-129
BL01064C 15.22 7.136e-25 202-235

297 

BL00030 

Eukaryotic RNA-binding region RNP-1 
proteins. 

BL00030A 14.39 2.929e-13 37-56 
BL00030B 7.03 1.900e-11 167-177
BL00030A 14.39 2.000e-10 128-147

298

BL01183 

ubiE/COQ5 methyltransferase family 
proteins. 

BL01183B 21.31 6.660e-12 143- 
188 

299 

BL01279 

Protein-L-isoaspartate(D-aspartate) O-
methyltransferase signa.

BL01279A 24.27 5.862e-11 57-105

301 

BL00191 

Cytochrome b5 family, heme-binding 
domain proteins. 

BL00191K 17.38 4.951e-27 184-228
BL00191J 11.37 6.447e-17 128-150

302 

DM00892 

3 RETROVIRAL PROTEINASE. 

DM00892C 23.55 3.893e-16 33-67 

306 

PF01140 

Matrix protein (MA), p15.

PF01140D 15.54 2.988e-09 416-451

307 

PR00245 

OLFACTORY RECEPTOR 
SIGNATURE 

PR00245A 18.03 4.818e-21 59-81
PR00245C 7.84 5.154e-20 238-254
PR00245D 10.47 4.000e-15 274-286
PR00245B 10.38 8.200e-15 177-192
PR00245E 12.40 5.714e-12 291-306

309 

BL00203 

Vertebrate metallothioneins proteins. 

BL00203 13.94 2.245e-10 612-658 

310 

BL00237 

G-protein coupled receptors proteins. 

BL00237A 27.68 7.632e-23 119-159
BL00237C 13.19 3.864e-15 251-278
BL00237D 11.23 3.739e-12 312-329

311 

BL00380 

Rhodanese proteins. 

BL00380D 15.90 8.200e-28 110-136
BL00380G 11.26 5.800e-16 267-280
BL00380B 14.77 7.000e-14 49-62
BL00380F 9.76 5.886e-13 203-214
BL00380C 15.67 7.387e-13 82-98
BL00380E 12.44 7.000e-11 181-193
BL00380A 10.48 1.000e-09 10-20

312 

BL00227 

Tubulin subunits alpha, beta, and gamma 
proteins. 

BL00227B 19.29 1.000e-40 50-105
BL00227C 25.48 1.000e-40 111-163
BL00227D 18.46 1.000e-40 220-274
BL00227F 21.16 1.000e-40 372-426
BL00227A 24.55 3.250e-39 1-35
BL00227E 24.15 8.500e-34 324-359

327 

BL00232 

Cadherins extracellular repeat proteins 
domain proteins. 

BL00232B 32.79 7.362e-21 225-273
BL00232B 32.79 2.588e-17 435-483
BL00232B 32.79 6.301e-15 116-164
BL00232B 32.79 6.769e-13 330-378
BL00232C 10.65 9.341e-12 223-241
BL00232C 10.65 5.696e-11 328-346
BL00232C 10.65 3.942e-10 433-451

329 

PD02749 

TRANSCRIPTION PROTEIN FACTOR 
BTF3 REGULATION NUCL. 

PD02749B 12.75 2.241e-37 35-71 
PD02749C 13.96 4.892e-28 87- 
121 PD02749A 9.56 6.000e-15 2- 
15 

330 

PR00391 

PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN SIGNATURE 

PR00391E 12.50 7.785e-15 211- 
PR00391B 8.39 1.000e-13 83-104
PR00391D 12.21 9.328e-13 191-207
PR00391A 7.83 5.390e-11 16-36

332 

BL01030 

RNA polymerases M / 15 Kd subunits 
proteins. 

BL01030 23.44 1.818e-23 87-125

337 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 2.929e-32 6-45 

340 

PD02711 

SYNTHASE
PHOSPHORIBOSYLFORMYLGLY.

PD02711B 14.26 1.973e-20 944-968

343 

BL00223 

Annexins repeat proteins domain 
proteins. 

BL00223C 24.79 1.000e-40 245-300
BL00223B 28.47 8.714e-38 168-218
BL00223A 15.59 8.250e-27 98-132
BL00223A 15.59 8.750e-27 26-60
BL00223C 24.79 9.438e-16 13-68
BL00223C 24.79 2.735e-15 85-140
BL00223A 15.59 2.253e-11 258-292

346 

PR00345 

STATHMIN FAMILY SIGNATURE

PR00345B 7.12 2.800e-28 81-110
PR00345E 8.54 7.652e-28 158-183
PR00345C 4.54 9.100e-28 110-134
PR00345D 10.97 1.964e-24 134-158
PR00345A 13.46 5.645e-16 52-71

347 

BL00586 

Ribosomal protein L16 proteins.

BL00586B 17.00 3.215e-15 184- 
221 

348 

PR00388 

3'5'-CYCLIC NUCLEOTIDE CLASS II
PHOSPHODIESTERASE SIGNATURE 

PR00388A 10.45 2.778e-09 86- 
105 

351 

BL00018 

EF-hand calcium-binding domain 
proteins. 

BL00018 7.41 3.118e-11 160-173
BL00018 7.41 2.350e-10 244-257 

354 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 1.947e-09 256-267 

358 

DM01206 

CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 

DM01206B 10.69 3.278e-09 175- 
195 DM01206B 10.69 6.696e-09
183-203 DM01206B 10.69 
8.633e-09 132-152 DM01206B 
10.69 8.861e-09 181-201 
DM01206B 10.69 9.316e-09 177- 
197 

361 

PD01498 

OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 

PD01498C 24.90 6.880e-14 219- 
263 

362 

PD01498 

OXIDASE BIOSYNTHESIS 
OXIDOREDUCTASE PORP. 

PD01498C 24.90 6.880e-14 219- 
263 

365 

BL00178 

Aminoacyl-transfer RNA synthetases 
class-I proteins. 

BL00178B 7.11 1.000e-11 589-
600 BL00178A 14.23 8.500e-09 
46-56 

366 

BL00523 

Sulfatases proteins. 

BL00523E 19.27 1.000e-23 318- 
348 BL00523A 13.36 5.500e-16 
30-47 BL00523B 8.64 1.964e-13
78-90 BL00523C 12.64 9.625e-13 
129-140 BL00523G 9.46 5.500e- 
10 506-516 

369 

BL00107 

Protein kinases ATP-binding region
proteins.

BL00107A 18.39 4.818e-09 21-52

370 

BL00880 

Acyl-CoA-binding protein. 

BL00880 17.52 1.000e-40 75-125 

371 

BL00107 

Protein kinases ATP-binding region 
proteins. 

BL00107A 18.39 1.000e-23 276-
307 BL00107B 13.31 1.692e-12 
342-358 

372 

PR00211 

GLUTELIN SIGNATURE 

PR00211B 0.86 6.602e-11 326-
347 PR00211B 0.86 6.106e-10
320-341 PR00211B 0.86 3.167e-
09 333-354 

373 

BL00279 

Membrane attack complex components / 
perforin proteins. 

BL00279E 37.11 9.349e-10 749-
797 

375 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 1.231e-33 10-49 

377 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 7.563e-28 10-49 

379 

BL00598 

Chromo domain proteins. 

BL00598 14.45 5.781e-16 3-25

380

PR00413 

HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 

PR00413D 11.28 8.941e-09 864-
878 

383 

PR00413 

HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 

PR00413D 11.28 8.941e-09 864- 
878 

387 

BL01060 

Flagella transport protein fliP family 
proteins. 

BL01060A 15.65 1.535e-09 131- 
174 

388 

PR00209 

ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 

PR00209B 4.88 6.318e-11 1009-
1028 

389 

PR00837 

ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 

PR00837B 11.64 1.000e-10 469-
483 

391 

BL00240 

Receptor tyrosine kinase class III 
proteins. 

BL00240B 24.70 7.907e-10 118- 
142 

392 

PR00014 

FIBRONECTIN TYPE III REPEAT 
SIGNATURE 

PR00014D 12.04 8.412e-10 691- 
706 

393 

PR00014 

FIBRONECTIN TYPE III REPEAT 
SIGNATURE 

PR00014D 12.04 8.412e-10 706- 
721 

394 

BL01209 

LDL-receptor class A (LDLRA) domain 
proteins. 

BL01209 9.31 3.368e-15 47-60 
BL01209 9.31 5.500e-13 92-105

395 

BL00634 

Ribosomal protein L30 proteins. 

BL00634 34.38 4.090e-13 70-121 

396 

BL01013 

Oxysterol-binding protein family 
proteins. 

BL01013D 26.81 8.000e-26 358-
402 BL01013A 25.14 7.231e-21
45-81 BL01013C 9.97 1.000e-13
132-142 BL01013B 11.33 1.000e-
11 110-121 

397 

BL00930 

Peripherin / rom-1 proteins. 

BL00930E 17.80 1.000e-40 56-92
BL00930D 9.12 4.632e-37 12-56 
BL00930F 16.91 2.800e-36 92- 
133 

400 

PR00780 

LEUSERPIN 2 SIGNATURE 

PR00780B 4.89 4.491e-09 262- 
285 

401 

PR00819 

CBXX/CFQX SUPERFAMILY 
SIGNATURE 

PR00819B 10.83 7.158e-11 4-20

403 

BL00381 

Endopeptidase Clp serine proteins.

BL00381C 23.84 1.250e-32 150-
194 BL00381A 16.48 2.286e-22 
74-111 BL00381B 21.42 8.326e- 
14 78-130 

405 

BL01105 

Ribosomal protein L35Ae proteins. 

BL01105A 17.37 1.000e-40 4-49
BL01105B 12.95 1.000e-40 68-
108 

406 

BL00344 

GATA-type zinc finger domain proteins. 

BL00344 17.99 7.000e-12 814-852 

407 

PR00211 

GLUTELIN SIGNATURE 

PR00211B 0.86 9.750e-09 73-94

409 

PR00910 

LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 

PR00910A 2.51 4.321e-09 9-22

410 

BL00762 

WHEP-TRS domain proteins. 

BL00762A 23.43 1.000e-28 752- 
789 BL00762A 23.43 4.400e-21 
903-940 BL00762A 23.43 5.415e-
18 825-862 BL00762B 16.14
8.759e-12 1154-1168

412 

BL00690 

DEAH-box subfamily ATP-dependent 
helicases proteins. 

BL00690B 13.38 5.320e-15 262- 
280 BL00690A 6.87 1.818e-13
230-240 

415 

BL00227 

Tubulin subunits alpha, beta, and gamma 
proteins. 

BL00227B 19.29 1.000e-40 52- 
107 BL00227C 25.48 1.000e-40
113-165 BL00227D 18.46 1.000e-
40 222-276 BL00227F 21.16
1.000e-40 382-436 BL00227E
24.15 1.750e-34 326-361
BL00227A 24.55 1.000e-33 1-35

416 

PF00992 

Troponin. 

PF00992A 16.67 1.711e-09 557-
592 

418 

BL00541 

Nuclear transition protein 1 proteins.

BL00541 8.44 9.875e-09 256-310 

419 

BL00541 

Nuclear transition protein 1 proteins. 

BL00541 8.44 9.875e-09 197-251 

420 

PF00856 

SET domain proteins. 

PF00856A 26.14 9.074e-13 901- 
938 PF00856B 16.42 2.397e-12 
951-973 

421 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 8.200e-12 33-44 

423 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.

PD01066 19.43 8.600e-30 130-169 

424 

PF00564 

Octicosapeptide repeat proteins. 

PF00564B 24.74 1.305e-17 421-
472

426 

PR00988 

URIDINE KINASE SIGNATURE 

PR00988A 6.39 4.569e-12 3-21 

427 

PR00988

URIDINE KINASE SIGNATURE 

PR00988A 6.39 4.569e-12 3-21 

428 

BL00478 

LIM domain proteins. 

BL00478B 14.79 3.250e-13 115-
130 BL00478B 14.79 9.036e-13
50-65 

431 

BL00282 

Kazal serine protease inhibitors family
proteins.

BL00282 16.88 8.875e-12 464-487 

432 

PD00930 

PROTEIN GTPASE DOMAIN 
ACTIVATION. 

PD00930B 33.72 7.800e-18 316- 
357 PD00930A 25.62 9.617e-12 
125-151 PD00930B 33.72 2.521e-
10 214-255 

433 

PD01066

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 4.649e-34 34-73

434 

PR00449 

TRANSFORMING PROTEIN P21 RAS 
SIGNATURE 

PR00449A 13.20 7.563e-11 56-78

436 

PR00120 

H+-TRANSPORTING ATPASE 
(PROTON PUMP) SIGNATURE 

PR00120C 9.90 5.800e-19 705- 
722 

437 

BL00115 

Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 

BL00115T 8.45 7.273e-29 1208-
1242 BL00115Q 18.08 2.776e-21
953-983 BL00115Y 11.86 8.000e-
17 1604-1650 BL00115M 19.19
8.130e-16 731-774 BL00115H
14.34 9.392e-16 463-496
BL00115A 15.44 7.414e-15 43-82
BL00115R 6.50 6.128e-14 983-
1010 BL00115J 16.71 9.289e-14
591-617 BL00115I 8.33 4.336e-
13 535-590 BL00115L 12.25
5.939e-13 662-694 BL00115G
11.65 6.011e-13 435-463
BL00115K 15.03 3.417e-10 617-
659 BL00115O 16.76 5.805e-10
863-913 BL00115P 11.54 7.538e- 
10 913-953 BL00115S 18.24 
7.968e-10 1010-1052 BL00115U 
10.34 4.475e-09 1242-1265 

438 

PF00628 

PHD-finger. 

PF00628 15.84 4.536e-10 219-234 

440 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 6.351e-34 10-49 

441 



PR0m09A 9 68 S ?^0e-24 32-55 
PR00309D 7.09 4.938e-23 290- 
309 PR00309B 7.81 2.800e-21
69-88 PR00309C 8.22 1.621e-19
165-183 PR00309E 9.82 9.438e- 
15 374-389 

442 

BL00600 

Aminotransferases class-III pyridoxal-
phosphate attachment si.

BL00600B 19.60 7.324e-14 103-
129 BL00600G 12.43 2.125e-12
306-^7 S RI 00600F R 77 R ia^p 
12 271-284 BL00600E 16.43 
3.167e-11 228-257 BL00600D
8.71 8.650e-09 207-221 

443 

BL00972 

Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 

BL00972A 11.93 3.160e-18 69-87

444 

BL00349 

CTF/NF-I proteins. 

BL00349A 10.07 1.000e-40 8-54 
BL00349C 9.33 1.000e-40 82-125
BL00349E 10.79 1.000e-40 152-

1 OS RT fMV*4QF 1 1 R 1 1 ftf)fV_4n 

213-255 BL00349H 15.70 7.387e- 
36 361-399 BL00349B 10.51 

9.100e-34 125-152 BL00349G 
19 77 S 781p-'?0 'W-ISfi 

445 

BL00154 

E1-E2 ATPases phosphorylation site 
proteins. 

BL00154F 8.23 8.941e-21 271- 
295 BL00154E 20.37 2.620e-15
124-165 

448 

DM00215 

PROLINE-RICH PROTEIN 3. 

DM00215 19.43 4.882e-11 82-115

451 

BL01283 

T-box domain proteins. 

BL01283A 24.15 3.100e-40 112-
160 BL01283D 11.70 6.000e-39 

ZJJ-ZOO DLuIZoJD ZJ.l / O.JJOC- 

38 170-212 BL01283C 13.05 
7.750e-19 222-236

452 

PR00420 

AROMATIC-RING HYDROXYLASE
(FLAVOPROTEIN
MONOOXYGENASE) SIGNATURE

PR00420A 14.78 2.579e-11 3-26

453

PR00162

RIESKE 2FE-2S SUBUNIT
SIGNATURE

JrKUUlOzrJ 1Z.// /.4Zye-I / Zl J- 

228 PR00162A 9.35 2.324e-14
iQ^-?ns PR0ftifi?r r in 7 i9ftp- 

17J-ZUJ rRvvIUZW O.lu /-1Z17C- 

14 227-240 

454 

PD01066 

PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU. 

X LJ\J l\J\J\j ly.'-rJ /.VUuC'JU O/ 

456 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 9.333e-18 1149-
1192

457 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 2.737e-24 16-55 

459 

BL00290

Immunoglobulins and major
histocompatibility complex proteins.

RT ftO?QftA 90 RQ 1 S7Qp 14 1 S4- 

177 BL00290B 13.17 9.000e-12
214-232 

460 

PR00413 

HALOACID 

DEHALOGENASE/EPOXIDE 
HYDROLASE FAMILY SIGNATURE 

PR00413F 14.91 7.333e-11 193-
214 PR00413E 15.78 5.714e-09
175-192 

463 

PR00759 

BASIC PROTEASE (KUNITZ-TYPE)
INHIBITOR FAMILY SIGNATURE

PR00759B 11.26 8.185e-09 74-85

466 

BL00019 

Actinin-type actin-binding domain 
proteins. 

BL00019D 15.33 4.200e-19 300- 
330 

467 

BL00019 

Actinin-type actin-binding domain
proteins.

BL00019D 15.33 4.200e-19 300- 

469 

PR00153 

CYCLOPHILIN PEPTIDYL-PROLYL 
SIGNATURE 

PR00153D 11.99 3.250e-15 510- 

<v>1 PROfll^r 1 1 ftl 4 6R9p 14 

495-511 PR00153E 9.10 8.548e-
14 523-539 PR00153B 11.57
1.720e-13 452-465 

470 

BL00491 

Aminopeptidase P and proline 
dipeptidase proteins. 

BL00491C 12.15 3.912e-09 557- 
572 

471 

PD00289 

PROTEIN SH3 DOMAIN REPEAT
PRESYNA.

PD00289 9.97 1.000e-14 1482-
1496 PD00289 9.97 8.650e-11
1122-1136 

474 

BL50040 

Elongation factor 1 gamma chain profile. 

BL50040D 17.41 1.000e-40 279-

^90 RT ^0040P 1 51 70 1 OOn» /in 
^11 3RR RT S0040F 1R 00 <J ^90a 

40 390-428 BL50040C 22.62 
3.739e-38 141-184 BL50040B 
13.65 7.000e-30 59-85 BL50040A 
12.98 1.450e-14 10-22 

475 



RT 01 1 44 9S 07 1 000^-40 97 74 

476

PR00007 

COMPLEMENT C1Q DOMAIN 

PR00007C 15.60 2.421e-21 589- 

611 PR00007R 14 16 1 ^00#> 91 

544-564 PR00007A 19.33 6.897e- 
90 SI7-S44 PP00007H Q 64 

6.571e-12 623-634 

477 

BL50002 

Src homology 3 (SH3) domain proteins 
profile. 

BL50002A 14.19 5.846e-10 170- 
189 

479 

DM01970 

Okw ZK632.12 YDR313C 
ENDOSOMAL III. 

DM01970B 8.60 9.500e-17 967-
980 

480

PR00868

DNA-POLYMERASE FAMILY A (POL
I) SIGNATURE

DDAft9<QP 11 7/C ^ <oo n 1 TOO/I 

rKUUoooC 13. /O j.oooe-1 / Zo- 
loft PPAA86B A 1 £ 11 1 1 Q£« 1 1 

224-247 PR00868H 12.51 3.388e- 
n yi/ic ppahqaot in 07 

7.938e-11 462-476 PR00868E

11 10 1 608^-10^40 366 

481 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 9.182e-22 53-96 

482 

BL00061 

Short-chain dehydrogenases/reductases 
family proteins.

BL00061B 25.79 3.647e-21 188- 

226

483 

BL50002 

Src homology 3 (SH3) domain proteins 
profile. 

BL50002A 14.19 1.750e-12 1032- 
1051 

485

PF00023

Ank repeat proteins. 

PF00023A 16.03 9.625e-10 760-
776 PF00023A 16.03 3.571e-09 

715-731 

486 

PD02870 

RECEPTOR INTERLEUKIN-1
PRECURSOR. 

PD02870B 18.83 9.262e-20 103- 
136 PD02870D 15.74 9.426e-09 
201-236 

487 

PR00370 

FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE

PR00370G 10.45 3.769e-28 471- 
493 PR00370B 10.91 1.000e-24

77 a A DDnnnnr n to >i aaa^ 7i 
Z /-40 rKUUi /UC 1 Z. /Z 4.UUUe-z 1 

140-157 PR00370E 11.96 9.229e- 

91 390 HQ PP00170n 16 11 

1 7^0^-90 IR^-904 PP00370F 
i. / jwzu ioj"Z.uh jtivuvj iur 

Ml SI 395e-20 375-395 

PR00370A 3 35 2 038e-18 4-20 

489 

PD01675 

GLYCOPROTEIN MAJOR ENVELOPE 
PROBABLE U3. 

PD01675C 19.89 2.330e-10 55-89

492 

BL00211 

ABC transporters family proteins. 

BL00211A 12.23 5.050e-09 45-57

493 

BL00211

ABC transporters family proteins.

BL00211A 12.23 5.050e-09 45-57

494 

BL00211 

ABC transporters family proteins. 

BL00211A 12.23 5.050e-09 58-70 

495 

BL00027 

'Homeobox' domain proteins.

BL00027 26.43 6.786e-12 509-552 
BL00027 26.43 9.143e-12 319-362
BL00027 26.43 2.600e-11 627-670
BL00027 26.43 5.625e-10 779-822

497 

BL00107 

Protein kinases ATP-binding region
proteins.

BL00107A 18.39 5.800e-22 214-
245 BL00107B 13.31 1.000e-13 
281-297 BL00107A 18.39 3.520e- 
13 583-614 BL00107B 13.31 
8.615e-12 652-668 

499 

BL00383 

Tyrosine specific protein phosphatases
proteins.

BL00383E 10.35 1.000e-14 1902-
1913 BL00383D 11.92 3.077e-14
1862-1875 BL00383A 13.34 
5.500e-14 1730-1745 BL00383C 
10.10 2.000e-13 1785-1796 
BL00383F 15.51 9.069e-12 1940- 
1956 BL00383B 7.61 1.692e-11
1755-1764 

501 

PR00019 

LEUCINE-RICH REPEAT 
SIGNATURE 

PR00019B 11.36 1.360e-09 136-
150 PR00019A 11.19 1.667e-09
91-105 PR00019B 11.36 4.600e-
09 160-174 

503 

BL00226 

Intermediate filaments proteins. 

BL00226D 19.10 1.000e-40 367- 
414 BL00226B 23.86 6.143e-27
195-243 BL00226A 12.77 7.840e-
14 96-111 BL00226C 13.23
2.600e-13 309-340 BL00226C
13.23 6.143e-12 266-297 
BL00226B 23.86 1.209e-09 146- 
194 


PD02407

3-BISPHOSPHOGLYCERATE-
INDEPENDENT PHOSPHOGLYCER.

PD02407F 7.61 6.739e-09 916- 
930 

506


HECT-domain (ubiquitin-transferase).

PF00632C 20.66 9.830e-19 991-
1023 PF00632B 18.45 1.155e-11
940-968 

507 

BL01082 

Ribosomal protein L7Ae proteins. 

BL01082 20.37 4.273e-20 76-116

508 

BL00678 

Trp-Asp (WD) repeat proteins proteins.

BL00678 9.67 2.421e-09 493-504

509 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 2.421e-09 473-484 

510 

PR00320 

G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 

PR00320B 12.19 4.774e-11 567-
582 PR00320B 12.19 5.886e-10 
763-778 PR00320C 13.01 6.760e- 
10 567-582 PR00320A 16.74 
7.618e-10 846-861 PR00320A 
16.74 3.415e-09 763-778 
PR00320A 16.74 6.268e-09 567- 
582 


511

BL00479

Phorbol esters / diacylglycerol binding 
domain proteins. 

BL00479C 12.01 3.250e-12 170- 
183 

512

BL50058

G-protein gamma subunit profile.

BL50058 11.15 7.494e-09 10-58

513 

BL00524 

Somatomedin B domain proteins. 

BL00524A 9.65 8.925e-14 80-101 

515

BL00041

Bacterial regulatory proteins, araC family 
proteins. 

BL00041 23.99 1.964e-19 492-524

516

PD00066

PROTEIN ZINC-FINGER METAL-
BINDI.

PD00066 13.92 8.500e-13 391-404 

517

BL00415

Synapsins proteins. 

BL00415E 4.82 9.291e-09 959-
996 

518 

PR00109 

TYROSINE KINASE CATALYTIC
DOMAIN SIGNATURE

PR00109B 12.27 9.471e-12 126-
145

519 

BL00290 

Immunoglobulins and major 
histocompatibility complex proteins. 

BL00290B 13.17 4.750e-09 47-65 

522 

PR00505 

D12 CLASS N6 ADENINE-SPECIFIC 
DNA METHYLTRANSFERASE 
SIGNATURE 

PR00505A 14.15 7.128e-09 364-
381 

525 

BL00312 

Glycophorin A proteins. 

BL00312B 9.22 5.781e-10 891- 

528 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 2.500e-32 16-55 

529 

PR00254 

NICOTINIC ACETYLCHOLINE 
RECEPTOR SIGNATURE 

PR00254D 15.50 4.000e-17 131-
150 PR00254A 11.23 4.706e-14
61-78 PR00254C 11.36 4.000e-12
113-126 PR00254B 12.97 1.486e-
11 95-110 

531 

BL00741 

Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 

BL00741B 14.27 6.870e-16 787-
810 

532 

PR00193

MYOSIN HEAVY CHAIN
SIGNATURE 

PPOOIQir* 1 yf 1 1 Alt* 1A AA1 

476 PR00193C 12.60 7.632e-32 

01 < 744 PPOOl Q1P 1 1 60 1 T\C\t> 

29 167-193 PR00193A 15.41 

7 1 11-11 1 PR001Q1F 
1Q47??0Oe-?l S01-S30 

533 

PD02870 

RECEPTOR INTERLEUKIN-1
PRECURSOR.

PD02870B 18.83 5.596e-09 348-
381

535 

PR00683 

SPECTRIN PLECKSTRIN 
HOMOLOGY DOMAIN SIGNATURE 

PR00683D 15.87 2.452e-10 465- 
484 

536 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 6.684e-24 164-207 

538 

PR00239 

MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 

PR00239E 1.58 2.739e-09 225-
237 

539 

BL00406 

Actins proteins. 

tjt nA/iA/:n (L ^c i AAA- AC\ 1^7 
r>LUU4UoC O. / j I .UUUe-4U id f- 

212 BL00406B 5.47 6.143e-37 
oo i ak pt oo/ioaf* n /l /;no^ 

36 291-346 BL00406E 8.44

7 700o 11 1A4 414 PT 00406 A 

9.95 4.441e-23 7-42 

540 

PR00456

RIBOSOMAL PROTEIN P2
SIGNATURE

PPOfi/4 1 OA O TO /l/l *\Q 

541 

PR00456

RIBOSOMAL PROTEIN P2
SIGNATURE

PPOA4^£P 1 O/; Q 67<\*» IO 44 -\0 

542 

PF00023 

Ank repeat proteins. 

PF00023A 16.03 7.857e-11 138-
154

544 

PF00642 

Zinc finger C-x8-C-x5-C-x3-H type (and 
similar). 

PF00642 11.59 9.082e-10 838-849

546 

BL00383 

Tyrosine specific protein phosphatases 
proteins. 

BL00383E 10.35 4.115e-10 104-
115 

547 

BL01226 

Hydroxymethylglutaryl-coenzyme A 
synthase proteins. 

BL01226A 13.79 1.000e-40 50-89 
BL01226C 13.51 1.000e-40 127- 
167 BL01226D 11.60 1.000e-40 
174-210 BL01226E 13.74 1.000e-
40 212-253 BL01226H 17.74 

1 AAAa A A IO/: /tl/f PT A177/£T 

1 .UUUe-4U J 50-434 dLUIzzoI 
25.06 1.000e-40 460-508 

PT 0177/;n 1 -\ 7£ 1 4R1*» 17 709- 

JdIafizzou 1 j./o j.-+oje-jz zyz- 
321 BL01226B 13.35 1.818e-31 
95-127 BL01226F 9.78 8.714e-23
253-271 

549 

BL00964

Syndecans proteins. 

RT OOQf^P 17 0*\ 7 476p 10 1746- 
ijL.UU!7t>*tD 1Z.UJ Z.HZOC-1U iz*+u- 

17RQ 

551 

DM01930 

2 kw FINGER SMCX SMCY 
i UKuyo w. 

DM01930E 15.41 1.367e-37 170- 

71^ HMOIQIOF 14 16 R 7T?p-?R 

267-303 DM01930B 19.86 
9.163e-10 37-71 

552 

BL00195 

Glutaredoxin proteins. 

BL00195B 15.31 7.158e-09 9-29 

554 

BL00383 

Tyrosine specific protein phosphatases 
proteins. 

447 


PR00403

WW DOMAIN SIGNATURE

PR00403B 12.19 7.617e-11 122-
137 PR00403A 16.82 3.912e-10
107-121 PR00403B 12.19 2.068e- 
09 76-91 

558 

PR00380 

KINESIN HEAVY CHAIN 
SIGNATURE 

PR00380A 14.18 2.714e-26 76-98
PR00380D 9.93 3.000e-24 275-
297 PR00380C 13.18 5.154e-20
226-245 PR00380B 12.64 9.400e- 
20 195-213 

559

BL00518 

Zinc finger, C3HC4 type (RING finger), 
proteins. 

BL00518 12.23 5.333e-09 522-531 

561

PD01795

PROTEIN AMINOPEPTIDASE 
PRECURSOR HYDROLASE SIGNA. 

PD01795B 11.56 2.333e-12 159-
172 PD01795A 10.27 1.000e-09 
135-144 

562 

PD01795 

PROTEIN AMINOPEPTIDASE
PRECURSOR HYDROLASE SIGNA.

PD01795B 11.56 2.333e-12 110- 
123 PD01795A 10.27 1.000e-09
86-95 

JO J 

BL00018

EF-hand calcium-binding domain 
proteins. 

BL00018 7.41 1.391e-09 41-54

565 

BL00348 

p53 tumor antigen proteins. 

BL00348F 23.19 4.143e-09 188-
231 

567 

PD00301 

PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 

PD00301B 5.49 4.115e-09 284-
295 

569 

PF00850 

Histone deacetylase family. 

PF00850E 8.88 6.553e-21 756-782 
PF00850D 14.76 1.519e-16 722-
746 PF00850F 15.70 1.118e-11
794-827 PF00850G 22.75 8.375e-
11 833-875 

570 

PD00289 

PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 

PD00289 9.97 4.960e-10 137-151 

571 

BL00518 

Zinc finger, C3HC4 type (RING finger), 
proteins. 

BL00518 12.23 8.800e-11 44-53

573 

BL00299 

Ubiquitin domain proteins. 

BL00299 28.84 1.123e-11 123-175

574 

PF01140 

Matrix protein (MA), p15.

PF01140D 15.54 3.700e-10 986-
1021 

576 

BL00284 

Serpins proteins. 

BL00284C 28.56 5.200e-26 200- 
242 BL00284A 15.64 4.913e-18 
71-95 BL00284B 17.99 7.261e-15 
173-194 BL00284D 16.34 5.846e- 
13 306-333 BL00284E 19.15 
7.429e-12 387-412 

579 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 6.553e-29 15-54 

580 

BL50001 

Src homology 2 (SH2) domain proteins 
profile. 

BL50001B 17.40 4.500e-12 1010-
1031 

581 

PD00930

PROTEIN GTPASE DOMAIN 
ACTIVATION. 

PD00930B 33.72 3.189e-22 608- 
649 PD00930A 25.62 6.806e-17 
505-531


BL00612

Osteonectin domain proteins. 

BL00612B 11.35 2.034e-11 93-
126 

585 

DM01551 

kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 

DM01551C 14.62 8.859e-10 102- 
122 

586 

PF00628 

PHD-finger. 

PF00628 15.84 3.455e-12 235-250 

587

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 6.063e-10 85-128 

588 

PR00326 

GTP1/OBG GTP-BINDING PROTEIN 
FAMILY SIGNATURE 

PR00326A 8.75 7.525e-16 227-
248 PR00326C 9.79 6.760e-15
276-292 PR00326D 19.09 6.657e- 
13 293-312 PR00326B 16.74 
9.229e-13 248-267 

589

BL00422

Granins proteins.

2378 

590 

BL00415 

Synapsins proteins. 

BL00415N 4.29 9.794e-10 295-
339 

591 

BL00128 

Alpha-lactalbumin / lysozyme C proteins. 

BL00128A 20.76 3.423e-13 35-65
BL00128C 19.34 2.980e-11 110-
132

596 

PR00049 

WILM'S TUMOUR PROTEIN 
SIGNATURE 

PR00049D 0.00 3.136e-09 31-46 

597 

DM00547 

1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 

DM00547C 17.30 1.667e-19 207-
229 DM00547E 13.94 6.200e-18
319-342 DM00547B 11.28
1.000e-17 179-193 DM00547D
11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679-
726 DM00547A 12.38 4.818e-11
158-170 

600 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 1.882e-27 13-52

601 

BL00192 

Cytochrome b/b6 heme-ligand proteins. 

BL00192A 11.90 6.400e-09 390-
430 

602 

BL00936 

Ribosomal protein L35 proteins. 

BL00936B 27.27 8.615e-09 118-
157 

603 

BL00936 

Ribosomal protein L35 proteins. 

BL00936B 27.27 8.615e-09 118-
157 

606 

PR00019 

LEUCINE-RICH REPEAT 
SIGNATURE 

PR00019B 11.36 7.300e-10 292-
306 PR00019A 11.19 5.667e-09 
323-337 

607 

PR00019 

LEUCINE-RICH REPEAT 
SIGNATURE 

PR00019B 11.36 7.300e-10 292- 
306 PR00019A 11.19 5.667e-09 
323-337 

608 

PR00320 

G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 

PR00320C 13.01 9.500e-12 168- 
183 PR00320A 16.74 2.853e-10 

14-29 PR00320C 13.01 5.320e-10 
60-75 PR00320C 13.01 5.680e-10 
14-29 PR00320A 16.74 6.049e-09 
217-232 PR00320B 12.19 8.875e- 
09 168-183 

610 

BL00750 

Chaperonins TCP-1 proteins.

BL00750B 16.17 1.000e-40 70- 
120 BL00750A 20.07 6.211e-37
26-69 BL00750G 20.12 8.800e-31
431-471 BL00750F 18.40 5.125e-
30 370-411 BL00750E 24.59
8.650e-29 295-332 BL00750H
21.44 1.000e-27 489-524
BL00750C 25.65 5.345e-17 149-
181 BL00750D 16.16 6.318e-14
203-222 

613 

BL00766 

Tetrahydrofolate
dehydrogenase/cyclohydrolase proteins.

BL00766B 24.49 1.000e-40 142-
190 BL00766E 13.78 1.000e-40 
322-359 BL00766C 25.86 5.500e- 
39 208-256 BL00766D 17.05 
4.536e-26 283-313 BL00766A 
21.48 6.063e-24 102-132 

615 

BL00256 

Adipokinetic hormone family proteins. 

BL00256 12.28 3.298e-10 746-755 

616 

BL00319 

Amyloidogenic glycoprotein extracellular 
domain proteins. 

BL00319C 17.12 9.053e-09 419- 
453 

617 

BL00030 

Eukaryotic RNA-binding region RNP-1 
proteins. 

BL00030A 14.39 4.429e-09 44-63 

618

BL00030 

Eukaryotic RNA-binding region RNP-1 
proteins. 

BL00030A 14.39 4.429e-09 44-63 

620 

BL00325 

Actin-depolymerizing proteins. 

BL00325B 21.66 5.817e-16 77- 
123 

622 

BL00972 

Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.

BL00972A 11.93 5.500e-19 213-
231 BL00972D 22.55 2.742e-16
501-526 BL00972B 9.45 1.000e-
11 297-307 BL00972C 16.48
3.160e-11 370-385 BL00972E
20.72 7.517e-10 526-548

625 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.

PD01066 19.43 6.333e-39 6-45 

628 

BL00039 

DEAD-box subfamily ATP-dependent
helicases proteins. 

BL00039D 21.67 7.750e-31 478- 
524 BL00039A 18.44 2.000e-25 
198-237 BL00039C 15.63 1.844e-
15 327-351 BL00039B 19.19 
5.636e-14 242-268 

630 

PD00306 

PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 

PD00306A 10.26 7.000e-12 232- 
246 

631 

PD00306 

PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 

PD00306A 10.26 7.000e-12 290- 
304 

633 

BL00785 

5'-nucleotidase proteins.

BL00785C 9.45 3.625e-16 108- 
122 BL00785E 15.85 4.000e-16 
279-295 BL00785A 9.73 6.500e- 
14 29-40 BL00785B 10.65 
5.500e-13 72-86 BL00785D 9.89 
4.000e-12 135-145 

636 

PR00832 

PAXILLIN SIGNATURE 

PR00832E 14.43 9.901e-14 85- 
108 

637 

PR00109 

TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 

PR00109B 12.27 6.362e-13 221- 
240 

638 

PF00635 

MSP (Major sperm protein) domain 
proteins. 

PF00635B 15.84 4.900e-11 463-
502 

639 

PR00860 

VERTEBRATE METALLOTHIONEIN 
SIGNATURE 

PR00860B 7.04 1.900e-15 85-99
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76

641 

PD00066 

PROTEIN ZINC-FINGER METAL- 
BINDI. 

PD00066 13.92 4.462e-15 271-284
PD00066 13.92 4.462e-15 299-312
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 1.500e-13 551-564
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536
PD00066 13.92 9.500e-13 215-228
PD00066 13.92 9.500e-13 243-256
PD00066 13.92 9.500e-13 579-592
PD00066 13.92 8.615e-10 607-620 
PD00066 13.92 1.600e-09 187-200 

642 

BL00961 

Ribosomal protein S28e proteins. 

BL00961B 11.24 7.429e-37 67- 
100 BL00961A 9.90 4.079e-26
42-66 

643 

BL00585 

Ribosomal protein S5 proteins. 

BL00585A 28.43 1.391e-40 103- 
155 BL00585B 18.78 3.250e-30 
193-230 

647

BL00678

Trp-Asp (WD) repeat proteins proteins. 

T->y (\r\/:nQ o HI O A Aflo in 191 1 CD 

648 

PR00876 

NEMATODE METALLOTHIONEIN 
SIGNATURE 

PR00876C 6.15 9.229e-09 112- 
126 

652 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 5.941e-27 29-68

653 

BL00047 

Histone H4 proteins. 

BL00047A 13.53 1.000e-40 2-41
BL00047B 6.51 1.429e-40 41-74
BL00047C 12.18 1.310e-38 74-
104 

654 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 4.109e-25 30-69 

655 

BL01115 

GTP-binding nuclear protein ran proteins.

BL01115A 10.22 3.483e-17 19-63

657 

BL00518 

Zinc finger, C3HC4 type (RING finger), 
proteins. 

BL00518 12.23 8.286e-10 31-40 

658 

BL00125 

Serine/threonine specific protein 
phosphatases proteins. 

BL00125B 21.48 1.000e-40 89- 
135 BL00125C 19.97 1.000e-40 
153-200 BL00125D 33.11 1.000e-
40 213-268 BL00125A 14.83 
8.941e-38 47-84 

659 

PD00066 

PROTEIN ZINC-FINGER METAL- 
BINDI. 

PD00066 13.92 8.200e-16 492-505 
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365
PD00066 13.92 7.000e-13 240-253
PD00066 13.92 7.500e-13 268-281
PD00066 13.92 7.500e-13 408-421
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449

660 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 2.189e-26 29-68 

661 

BL00795 

Involucrin proteins. 

BL00795C 17.06 7.882e-15 193- 
238 BL00795C 17.06 3.797e-13 
187-232 BL00795C 17.06 5.014e-
13 188-233 BL00795C 17.06
4.506e-12 196-241 BL00795C
17.06 7.896e-12 191-236
BL00795C 17.06 1.667e-11 185-
230 BL00795C 17.06 2.000e-11
198-243 BL00795C 17.06 3.778e-
11 171-216 BL00795C 17.06
6.111e-11 197-242 BL00795C
17.06 6.444e-11 194-239
BL00795C 17.06 8.000e-11 189-
234 BL00795C 17.06 8.556e-11
192-237 BL00795C 17.06 1.733e-
10 195-240 BL00795C 17.06
2.779e-10 184-229 BL00795C
17.06 4.035e-10 199-244
BL00795C 17.06 5.081e-10 186-
231 BL00795C 17.06 6.965e-10
190-235 BL00795C 17.06 2.700e-
09 200-245 BL00795C 17.06
5.800e-09 175-220 BL00795C 
17.06 6.500e-09 182-227 
BL00795C 17.06 6.600e-09 201- 
246 BL00795C 17.06 6.600e-09 
202-247 BL00795C 17.06 6.600e- 
09 208-253 

662 

BL00469 

Nucleoside diphosphate kinases proteins. 

BL00469 22.22 1.000e-40 149-204 

663 

BL01160 

Kinesin light chain repeat proteins. 

BL01160B 19.54 9.411e-11 331-
385 

664 

BL00601 

Tryptophan pentad repeat proteins (IRF 
family) proteins. 

BL00601A 20.29 5.500e-23 7-46 
BL00601B 20.92 3.631e-13 69-98 

665 

BL00082 

Extradiol ring-cleavage dioxygenases 
proteins. 

BL00082A 19.07 8.615e-12 49-72

666 

DM01537 

kw SKI2W SKI2 NUCLEOLAR
HELICASE.

DM01537B 21.63 4.073e-37 834-
881 DM01537B 21.63 9.750e-21
1669-1716 DM01537A 15.14
8.650e-18 698-718 DM01537A
15.14 6.766e-12 1537-1557

667 

DM01537 

kw SKI2W SKI2 NUCLEOLAR
HELICASE. 

DM01537B 21.63 7.923e-38 820- 
867 DM01537B 21.63 9.750e-21 
1655-1702 DM01537A 15.14
8.650e-18 684-704 DM01537A
15.14 6.766e-12 1523-1543

669 

BL00107 

Protein kinases ATP-binding region 
proteins. 

BL00107A 18.39 6.786e-24 849- 
880 BL00107B 13.31 6.727e-13 
916-932

670 

BL00299 

Ubiquitin domain proteins. 

BL00299 28.84 9.735e-27 37-89

671 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 6.571e-12 432-475 

676 

PR00861 

ALPHA-LYTIC ENDOPEPTIDASE 
SERINE PROTEASE (S2A) 
SIGNATURE 

PR00861E 9.88 2.385e-09 206- 
221 

678 

BL00225 

Crystallins beta and gamma 'Greek key'
motif proteins. 

BL00225B 18.06 7.517e-24 1805- 
1840 BL00225B 18.06 8.297e-20 
1987-2022 BL00225B 18.06 
2.575e-19 1896-1931 BL00225B 
18.06 8.200e-19 175-210 
BL00225B 18.06 8.200e-19 1698- 
1733 BL00225B 18.06 4.808e-14
73-108 BL00225B 18.06 4.808e-
14 1596-1631 BL00225B 18.06
5.500e-14 2077-2112 BL00225A
13.82 5.829e-12 2043-2064 
BL00225A 13.82 3.127e-09 1759- 
1780 

679 

PR00320 

G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 

PR00320C 13.01 4.240e-10 169- 
184 PR00320A 16.74 6.294e-10 
169-184 

680 

BL00243 

Integrins beta chain cysteine-rich domain 
proteins. 

BL00243I 31.77 1.143e-11 172-
215 

681 

PR00852 

XERODERMA PIGMENTOSUM 
GROUP D PROTEIN SIGNATURE 

PR00852H 5.90 1.000e-29 612- 
635 PR00852E 8.14 3.769e-27
348-371 PR00852D 11.38 8.875e-
27 309-331 PR00852B 11.08
2.800e-25 249-269 PR00852I
17.26 3.500e-25 683-704
PR00852F 11.85 5.909e-24 379-
398 PR00852G 16.19 4.462e-23
468-486 PR00852C 8.81 9.143e-
23 284-303 

682 

BL50058 

G-protein gamma subunit profile. 

BL50058 27.23 1.375e-35 15-63 

685 

BL00972 

Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 

BL00972A 11.93 7.500e-20 40-58 
BL00972D 22.55 3.903e-16 300- 
325 BL00972B 9.45 1.000e-13
120-130 BL00972E 20.72 5.500e-
11 325-347

687 

BL00237 

G-protein coupled receptors proteins. 

BL00237A 27.68 4.273e-14 98- 
138 

688 

BL00388 

Proteasome A-type subunits proteins.

BL00388A 23.14 1.000e-40 8-54
BL00388B 31.38 3.864e-33 66- 
108 BL00388D 20.71 1.000e-21 
153-184 BL00388C 18.79 8.147e- 
16 126-148 

689 

PD02796 

PROTEIN STEROL CARRIER LIPID-
TRAN.

PD02796B 20.92 1.105e-15 347-
394

691 

PD01572 

PHOTOSYSTEM II REACTION 
CENTRE T PROTEIN PHOTOS. 

PD01572 8.77 4.083e-09 1-31 

692 

BL00028 

Zinc finger C2H2 type domain proteins.

BL00028 16 07 7 600e-IO 4X8-50S 

694 

BL01013 

Oxysterol-binding protein family 
proteins. 

BL01013A 25.14 9.357e-33 527- 
563 BL01013D 26.81 8.235e-23 
814-858 BL01013C 9.97 6.211e-
14 615-625 BL01013B 11.33
3.605e-13 592-603 

695 

PD00289 

PROTEIN SH3 DOMAIN REPEAT 
PRESYNA.

PD00289 9.97 3.571e-13 164-178
PD00289 9.97 8.650e-11 2147-
2161 PD00289 9.97 2.552e-09 23- 
37 

698 

PR00161 

NICKEL-DEPENDENT 
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 

PR00161C 9.51 4.930e-09 282- 
302 

700 

PR00749

LYSOZYME G SIGNATURE

PRfifl74QF 1^ K\ R I^Q 

156 PR00749H 8.22 3.681e-12
173-194 PR00749B 16.54 1.419e-
11 48-70 PR00749C 7.26 3.060e-
11 72-91 PR00749A 10.33
4.815e-10 24-45

703 

PR00704 

CALPAIN CYSTEINE PROTEASE (C2) 
FAMILY SIGNATURE 

PR00704I 9.52 1.000e-29 476-505 
PR00704D 11.05 2.500e-27 132-
158 PR00704E 12.55 5.500e-27
162-186 PR00704F 13.61 1.000e-
22 187-215 PR00704G 13.87 

1 ')'17o 01 ^17 11Q PPOATfl/IUf 
I.Zj/e-Zl j1 /-JJ7 rlxUu/U4ri 

13.38 8.138e-21 367-385 

PPf)A704A 14 6R9 19Sr 1Q97 <\ 1 

PR00704C 11.88 1.257e-17 96- 
113 PR00704B 17.94 1.833e-15 
72-95 

705 

PR00859 

PROKARYOTE METALLOTHIONEIN
SIGNATURE

PR00859C 7.06 2.776e-09 94-111

706 

BL00226 

Intermediate filaments proteins. 

BL00226D 19.10 9.581e-26 369- 

416 RT 00996R 9^ Rfi ^ 9Sflf»-94 

203-251 BL00226C 13.23 8.269e- 
21 268-299 BL00226A 12.77
8.200e-14 103-118 

707 

PR00021 

SMALL PROLINE-RICH PROTEIN
SIGNATURE

PR00021A 4.31 2.440e-10 2-15

708 

BL00361 

Ribosomal protein S10 proteins. 

BL00361B 18.34 5.101e-10 82- 
105 

709 

PR00021 

SMALL PROLINE-RICH PROTEIN
SIGNATURE

PR00021A 4.31 2.200e-10 2-15

710 

BL00514 

Fibrinogen beta and gamma chains C-
terminal domain proteins.

RT 00^14P 17 41 R419**-97 Ififl- 

197 BL00514E 14.28 8.909e-16 
219-236 BT 00S14H 14 QS 1 S51e- 
15 317-342 BL00514G 15.98 
7.750e-15 284-314 BL00514D 
15.35 4.789e-10 201-214 

711 

PD00930 

PROTEIN GTPASE DOMAIN
ACTIVATION. 

PDOOQ^OR T\ 79 R 714p-19 40-QO 

714 

BL00400 

LBP / BPI / CETP family proteins. 

BL00400C 24.53 6.029e-17 158- 
202 BL00400D 23.26 2.080e-14 
222-259 BL00400A 21.59 1.600e- 
10 27-59 

715 

BL01154 

RNA polymerases L / 13 to 16 Kd
subunits proteins.

BL01154B 24.55 5.500e-36 40-76

bLUi 1j4A lis. /0 J.(JO0e-22 19-40 

716 

PD01066 

PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.

PD01066 19.43 9.786e-32 10-49 

717 

BL00215 

Mitochondrial energy transfer proteins. 

BL00215A 15.82 9.206e-14 77- 
102 BL00215A 15.82 8.412e-10 
175-200 

719 

BL00309 

Vertebrate galactoside-binding lectin 
proteins. 

BL00309C 18.65 2.241e-09 62-87

/zo 

BL00687

Aldehyde dehydrogenases glutamic acid 
proteins. 

BL00687E 25.37 7.136e-33 266- 
316 BL00687D 26.00 5.333e-28 
151-198 BL00687B 17.54 3.647e-
2o i9-o 1 BL00687C 24.13
6.087e-22 96-133 BL00687F 9.55 
Z.jUUe-1 1 3->2-3o3 

727 

DM01354 

kw TRANSCRIPTASE REVERSE II 
ORF2. 

DM01354N 13.17 1.000e-40 129- 
174 DM01354O 8.73 6.605e-15
180-226 

734 

PD00301 

PROTEIN REPEAT MUSCLE 
CALCIUM-BI. 

PD00301A 10.24 6.400e-09 101- 
112 

735 

BL01024 

Protein phosphatase 2A regulatory 
subunit PR55 proteins. 

BL01024A 10.26 1.000e-40 22-69
BL01024B 8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-40 146-
185 BL01024D 13.22 1.000e-40
185-222 BL01024E 11.96 1.000e-
40 222-266 BL01024F 9.42
1.000e-40 266-317 BL01024G
11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389-
442

736

PF00913

Trypanosome variant surface 
glycoprotein. 

PF00913D 11.90 7.130e-10 24-51

737 

PR00700 

PROTEIN TYROSINE PHOSPHATASE
SIGNATURE

PR00700D 12.47 2.200e-09 82-
101

740 

PR00320 

G-PROTEIN BETA WD-40 REPEAT
SIGNATURE

PR00320C 13.01 1.600e-09 68-83
PR00320A 16.74 7.366e-09 68-83

743 

PR00871 

DNA 

NUCLEOTIDYLEXOTRANSFERASE 
(TDT) SIGNATURE 

PR00871G 14.48 8.000e-09 178- 
201 

745 

BL00518 

Zinc finger, C3HC4 type (RING finger), 
proteins. 

BL00518 12.23 2.286e-10 33-42

749 

BL00215 

Mitochondrial energy transfer proteins. 

BL00215A 15.82 5.200e-15 221-
246 BL00215A 15.82 7.618e-14
20-45 BL00215A 15.82 8.851e-11
123-148 BL00215B 10.44 9.526e-
11 69-82 BL00215B 10.44
7.300e-09 272-285 BL00215B

1 A A A Q CAAa AA 1 CC 1 TO 

10.44 5.jUUe-09 165-178 

751 

BL50002 

Src homology 3 (SH3) domain proteins 
profile. 

BL50002A 14.19 1.000e-14 370-
389 BL50002B 15.18 2.200e-10

752 

BL00353 

HMG 1/2 proteins. 

BL00353B 11.47 3.089e-12 390-
440 

753 

PF00622 

Domain in SPla and the RYanodine
Receptor

PF00622B 21.00 4.214e-14 47-69 

754 

BL00211 

ABC transporters family proteins. 

BL00211A 12.23 8.941e-10 66-78

755 

PR00926 

MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 

PR00926F 17.75 7.750e-19 392-
415 PR00926C 16.07 5.935e-17
253-274 PR00926D 10.53 2.059e-
15 301-320 PR00926E 11.70
4.971e-15 344-363 PR00926B
16.07 9.526e-13 210-225
PR00926A 10.41 1.514e-12 197-
211

756 

BL01187 

Calcium-binding EGF-like domain 
proteins pattern proteins. 

BL01187A 9.98 2.125e-12 324-
336 BL01187A 9.98 4.789e-11
377-389 BL01187B 12.04 3.057e-
10 439-455

757 

PF00651 

BTB (also known as BR-C/Ttk) domain 
proteins. 

PF00651 15.00 4.429e-10 43-56

758 

PR00055 

HIV TAT DOMAIN SIGNATURE 

PR00055A 8.13 8.855e-09 144- 
156 

759 

PD00066 

PROTEIN ZINC-FINGER METAL- 
BINDI. 

PD00066 13.92 5.304e-11 110-123

760 

PR00448 

NSF ATTACHMENT PROTEIN
SIGNATURE

PR00448D 12.42 3.455e-27 162-
186 PR00448A 10.74 1.273e-22
37-57 PR00448B 16.01 9.379e-21
100-118 PR00448C 11.46 1.000e-
20 129-147

765 

BL01042 

Homoserine dehydrogenase proteins. 

BL01042A 13.29 5.909e-11 74-95

766 

PR00625 

DNAJ PROTEIN FAMILY 
SIGNATURE 

PR00625A 12.84 2.154e-18 26-46 
PR00625B 13.48 9.000e-16 57-78 

768 

BL00762 

WHEP-TRS domain proteins. 

BL00762A 23.43 8.500e-28 112-
149 BL00762B 16.14 3.793e-12
64-78 BL00762A 23.43 6.625e-12
6-43 BL00762C 15.58 4.176e-09 
459-472 BL00762D 11.15 9.667e- 
09 210-220 

769 

PR00709 

AVIDIN SIGNATURE 

PR00709A 4.60 1.934e-09 1-20

770 

PR00320 

G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 

PR00320C 13.01 1.720e-10 262- 
277 PR00320A 16.74 2.853e-10 
262-277 PR00320C 13.01 4.300e- 
09 96-111 PR00320B 12.19 
5.500e-09 262-277 PR00320A 
16.74 6.268e-09 55-70 

771 

PR00019 

LEUCINE-RICH REPEAT 
SIGNATURE 

PR00019B 11.36 8.714e-12 87- 
101 PR00019A 11.19 1.000e-10 
90-104 

772 

PD02807 

APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 

PD02807C 8.91 6.308e-10 110- 
159 

773 

PD02807 

APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 

PD02807C 8.91 6.308e-10 155- 
204 

774 

DM00547 

1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 

DM00547F 23.43 3.942e-28 943- 
990 DM00547E 13.94 9.750e-21 
652-675 DM00547B 11.28 
1.818e-18 518-532 DM00547C 
17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-
509 DM00547D 11.60 9.200e-11
622-636 

776 

PR00779 

INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 

PR00779F 14.51 5.147e-09 769- 
792 

777

PR00779 

INOSITOL 1,4,5-TRISPHOSPHATE- 
BINDING PROTEIN RECEPTOR 
SIGNATURE 

PR00779F 14.51 5.147e-09 742-
765 

778

PR00779

INOSITOL 1,4,5-TRISPHOSPHATE-
BINDING PROTEIN RECEPTOR
SIGNATURE

PR00779F 14.51 5.147e-09 742-
765

779

BL01282 

BIR repeat proteins. 

BL01282B 30.49 2.543e-09 6-45 

781 

PR00205 

CADHERIN SIGNATURE

PR00205B 11.39 3.118e-11 654-
672 PR00205B 11.39 8.588e-11
230-248 PR00205B 11.39 8.527e-
10 551-569 PR00205B 11.39
4.203e-09 336-354

783 

BL00625 

Regulator of chromosome condensation 
(RCC1) proteins. 

BL00625B 17.69 2.167e-19 193-
227 BL00625A 16.21 5.500e-17
199-228 BL00625B 17.69 1.885e-
16 140-174 BL00625B 17.69
2.770e-16 245-279 BL00625A
16.21 9.115e-16 251-280
BL00625A 16.21 6.507e-14 146- 
175 

785 

PF00084 

Sushi domain proteins (SCR repeat 
proteins. 

PF00084B 9.45 7.188e-10 595-607 
PF00084B 9.45 6.400e-09 656-668 

786 

PF00084 

Sushi domain proteins (SCR repeat 
proteins. 

PF00084B 9.45 7.188e-10 595-607
PF00084B 9.45 6.400e-09 656-668 

787 

BL00826 

MARCKS family proteins. 

BL00826C 7.63 6.738e-09 203-
230 

788 

PR00453 

VON WILLEBRAND FACTOR TYPE 
A DOMAIN SIGNATURE 

PR00453A 12.79 1.310e-14 36-54 
PR00453B 14.65 8.568e-10 75-90

789 

PR00102 

ORNITHINE 

CARBAMOYLTRANSFERASE 
SIGNATURE 

PR00102B 14.82 5.418e-09 963- 
977 

790 

BL00030 

Eukaryotic RNA-binding region RNP-1 
proteins. 

BL00030B 7.03 5.500e-11 199-
209 

791 

BL00415 

Synapsins proteins. 

BL00415N 4.29 9.519e-10 393-
437 BL00415N 4.29 2.117e-09
103-147 BL00415N 4.29 3.628e-
09 97-141 BL00415N 4.29
5.664e-09 387-431

795

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 2.091e-36 105-144 

799 

PF00731 

AIR carboxylase. 

PF00731C 23.16 7.333e-35 337- 
380 PF00731B 19.47 7.429e-28 
299-336 PF00731A 19.32 6.333e-
24 268-297 

804 

BL00170 

Cyclophilin-type peptidyl-prolyl cis-trans 
isomerase signatur. 

BL00170B 20.97 8.071e-09 297- 
337 

805 

BL00678 

Trp-Asp (WD) repeat proteins proteins. 

BL00678 9.67 3.400e-10 378-389 
BL00678 9.67 5.800e-10 418-429 
BL00678 9.67 8.800e-10 295-306

806

PD01719

PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 

PD01719A 12.89 7.571e-14 290-
318

807

PR00320

G-PROTEIN BETA WD-40 REPEAT
SIGNATURE

PR00320B 12.19 9.100e-09 451-
466

809

BL00107

Protein kinases ATP-binding region
proteins. 

BL00107A 18.39 4.462e-12 564-
595

810

PR00453

VON WILLEBRAND FACTOR TYPE
A DOMAIN SIGNATURE

PR00453A 12.79 1.310e-14 36-54
PR00453B 14.65 8.568e-10 75-90

814 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 2.047e-31 16-55 

ol3 

PD01066

PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.

PD01066 19.43 2.047e-31 16-55

817 

PR00193 

MYOSIN HEAVY CHAIN 
SIGNATURE 

PR00193D 14.36 5.154e-36 125- 
154 PR00193E 19.47 3.919e-18 
179-208 

818 

PR00830 

ENDOPEPTIDASE LA (LON) SERINE
PROTEASE (S16) SIGNATURE

PR00830A 8.41 9.571e-11 115-
135

819

BL00126

3'5'-cyclic nucleotide phosphodiesterases
proteins. 

di nnn/^r m n q<7 q o/i *co<? 
569 BL00126E 35.22 3.714e-15
669-724 BL00126D 25.50 1.173e-

1 4 ^ ft/1 6T>1 RT Ofl 1 9£R 1 ^ Ofi 

1.000e-12 502-514 BL00126A 

97 1 Ifilp 00 461 4QR 

820 

PR00511 

TEKTIN SIGNATURE 

PR00511B 12.25 8.826e-22 174- 

1 OS PttOO*? HA 1 ^ 5Q 7 79^p 1 1 ' 

155-172 

821 

BL00741

Guanine-nucleotide dissociation
stimulators CDC24 family sign.

RT 0074 1 R 14 97? ROOe- 1 S 1 1-16 

822 

PF00780

Domain found in NIK1-like kinases,
mouse citron and yeast ROM.

PTW17R0T 14 60 4 R9^p> OQ931 

261 

827 

BL00030 

Eukaryotic RNA-binding region RNP-1 
proteins. 

BL00030A 14.39 5.235e-11 144-
163 

828 

BL00326 

Tropomyosins proteins. 

BL00326D 8.76 9.357e-11 545-
586 

829 

PD02448 

TRANSCRIPTION PROTEIN DNA- 
BINDIN. 

PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-
133 PD02448C 13.62 1.000e-40
152-189 PD02448E 11.33 9.000e-
30 235-261 PD02448F 14.22
y.o;>4e-25 z/y-JUJ rIX)z44oJL> 
11.48 3.659e-18 197-211 
PD02448G 10.73 7.857e-16 305-

J 10 

830

BL00720

Guanine-nucleotide dissociation
stimulators CDC25 family sign. 

RT AfmflR 1/£ *\*7 A Kf\C\a TJ /tC2 

dJlaJU/zUd 10. j/ 4.jUue-Z3 4o_5- 
507 

831

BL00107

Protein kinases ATP-binding region 
proteins. 

RTflAIATA 1 ft 1Q A. AT^o O 1 I/IO 

JtSLUUlU/A lo.jy o.Ozje-zl 14J- 
174 BL00107B 13.31 4.214e-10 

917 990 

z l j-zzy 

832 

BL00215 

Mitochondrial energy transfer proteins. 

BL00215A 15.82 5.787e-11 32-57

833

PR 00407 

P40 SIGNATURE 

PI? 00407 A 09 4 17S#*_0Q 41 ^0 

834

BL00229

Tau and MAP proteins tubulin-binding
domain proteins. 

DJLUUZzyA zj.D / y.joje-iu yy- 
138 

835 

BL00421 

Transmembrane 4 family proteins. 

BL00421E 20.97 2.216e-09 1053- 
1083 

836 

BL00795

Involucrin proteins. 

BL00795B 12.41 7.931e-09 405- 
445 

837

PR00020

MAM DOMAIN SIGNATURE

DDArtAIAA 1© 1*7 1 AAA/. 1*7 1/1 

rxvUUuzUA 15.1 / l.UUUe-1 / 

PR00020B 15.52 5.846e-16 68-85 

PP00090n f 9 70 9 ^47*» 1 ^ 147 
rKUUV/ZUJJ 1Z. /U z.j^fje-1 J i*\ /- 

162 PR00020C 13.66 3.483e-13 

OS- 107 PP00090F 8 64 6 SR6P-17 

165-179 

838 

BL50017 

Death domain proteins profile. 

BL50017B 17.60 6.897e-13 1499-
1515 

839 

PF00850 

Histone deacetylase family. 

PF00850C 14.55 9.542e-09 1352- 
1 Joy 

840 

PF00023 

Ank repeat proteins. 

PF00023A 16.03 4.500e-12 44-60 
PF00023B 14.20 7.923e-11 73-83

PF00097R 14 90 0 000^-10 17Q 

149 PF00023B 14.20 5.500e-09 
40-50 

842 

BL01194 

Ribosomal protein L15e proteins. 

BL01194B 13.66 1.000e-40 37-85
BL01194C 12.35 9.250e-40 103-
138 BL01194A 18.70 7.632e-38
2-37 BL01194D 19.02 2.658e-36
139-178

843 

BL00610 

Sodium:neurotransmitter symporter
family proteins.

BL00610A 17.73 1.000e-40 40-90
BL00610B 23.65 1.000e-40 104-
154 BL00610C 12.94 1.000e-40
206-258 BL00610E 20.34 1.000e-
40 355-398 BL00610F 29.02
1.000e-40 454-509 BL00610D
20.97 6.063e-35 272-325 
BL00610G 12.89 8.588e-13 514- 
537 

845 

BL00143 

Insulinase family, zinc-binding region 
proteins. 

BL00143A 20.91 4.300e-20 94-
121 BL00143C 14.16 5.500e-13 
245-258 BL00143B 14.41 9.053e- 
10 141-156 

846 

PR00543 

OESTROGEN RECEPTOR 
SIGNATURE 

PR00543D 10.87 1.355e-09 898-
914 

847 

PR00543 

OESTROGEN RECEPTOR 
SIGNATURE 

PR00543D 10.87 1.355e-09 898- 
914 

848 

BL00824 

Elongation factor 1 beta/beta'/delta chain 
proteins. 

BL00824C 14.58 1.000e-40 129- 
167 BL00824D 14.04 6.192e-39 
167-202 BL00824B 9.21 2.080e- 
21 96-116 BL00824E 12.49 
3.333e-19 210-226 BL00824A 
13.78 8.650e-14 19-34 

849 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 1.000e-40 12-51 

850 

PD01066

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 7.316e-24 10-49 

852 

BL01272 

Glucokinase regulatory protein family 
proteins. 

BL01272B 19.61 6.870e-30 136- 
171 BL01272C 11.68 3.314e-25
249-274 BL01272A 6.49 1.231e-
18 99-117 

853 

PD00930 

PROTEIN GTPASE DOMAIN 
ACTIVATION. 

PD00930B 33.72 9.341e-20 65- 
106 

854 

PD00289 

PROTEIN SH3 DOMAIN REPEAT 
PRESYNA. 

PD00289 9.97 6.850e-11 140-154

858 

PR00450 

RECOVERIN FAMILY SIGNATURE 

PR00450C 12.22 3.250e-25 68-90 
PR00450B 11.76 8.125e-23 22-42
PR00450D 16.58 8.920e-22 92-
112 PR00450E 12.14 1.581e-19
114-133 PR00450G 15.33 5.500e- 
19 166-187 PR00450F 12.30 
4.375e-15 140-156 PR00450A 
13.58 1.857e-14 8-23 

860 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 7.188e-27 74-117

866 

BL00477 

Alpha-2-macroglobulin family thiolester 
region proteins. 

BL00477L 23.51 7.480e-20 54-87 

867 

BL01078 

Molybdenum cofactor biosynthesis 
proteins. 

BL01078B 14.20 1.621e-20 408-
429 BL01078A 10.16 2.000e-13
366-379 BL01078D 5.99 3.455e-
11 566-576 BL01078C 10.52
3.793e-11 501-513

868 

BL01177 

Anaphylatoxin domain proteins. 

BL01177E 20.64 5.800e-24 462-
489 BL01177C 17.39 5.333e-19 
416-435 BL01177B 13.61 7.840e- 
16 122-138 BL01177D 17.50 
1.900e-15 441-459 

869 

BL01177 

Anaphylatoxin domain proteins. 

BL01177E 20.64 5.800e-24 415-
442 BL01177C 17.39 5.333e-19
369-388 BL01177B 13.61 7.840e-
16 122-138 BL01177D 17.50
1.900e-15 394-412

871 

BL50007 

Phosphatidylinositol-specific

phospholipase X-box domain proteins 
prof. 

rt ^oftfWA io f%\ i nnn<=» ac\ no 
dljuuu / r\ i".oi i .uuue-^fu JZZ- 

368 BL50007D 19.54 1.000e-40 

S8Q-fi^l RT S0007R 7f) Qfi fi 700^ 
36 383-421 BL50007E 25.63 
9.053e-33 748-785 BL50007C 
8.97 5.200e-19 452-469 

872 

BL00972 

Ubiquitin carboxyl-terminal hydrolases
family 2 proteins.

BL00972D 22.55 3.250e-17 90-
115 

874 

PR00452 

SH3 DOMAIN SIGNATURE 

PR00452B 11.65 4.250e-09 370-
386 

877 

BL00741 

Guanine-nucleotide dissociation 
stimulators CDC24 family sign.

BL00741B 14.27 5.500e-13 1343- 
1366 

878 

DM00215 

PROLINE-RICH PROTEIN 3 

DM00215 19.43 2.525e-09 59-RS

881 

PD02807 

APOLIPOPROTEIN E PRECURSOR 
APO-E GLYCOPROTEIN PLAS. 

PD02807E 10.90 4.702e-09 358-
407 

882 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 7.188e-37 8-47 

885 

PF00023 

Ank repeat proteins. 

PF00023A 16.03 8.071e-09 10-26 

886 

PR00372 

BIOPTERIN-DEPENDENT
AROMATIC AMINO ACID
HYDROXYLASE SIGNATURE

PR00372B 10.30 9.308e-27 225- 
248 PR00372A 13.39 7.000e-24 

rKUuJ/2b 12.62 2. 125e- 
23 360-380 PR00372C 7.90 
3.025e-22 289-309 PR00372F 

11 OQ f, lllo 71 /fly! 

PR00372D 10.22 1.000e-19 329- 

j*+o 

887 

BL00301 

GTP-binding elongation factors proteins. 

BL00301B 20.09 2.800e-24 103- 
21-33 

888 

BL00518 

Zinc finger, C3HC4 type (RING finger),
proteins.

RT 00 S 1 X 1 7 9^ 1 6fi7p-.nQ 1ft 1Q 

889 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 4.906e-26 6-45 

890 

DM00179 

w KINASE ALPHA ADHESION T- 
CELL. 

DM00179 13.97 7.652e-09 113- 
191 

892 

BL01022 

PTR2 family proton/oligopeptide 

BL01022B 22.19 6.016e-14 72- 

118 RTftlft77F71S1 1 171f»-19 

472-508 BL01022A 11.58 9.135e-
12 42-61 BL01022D 9.42 3.455e-
11 199-212 

893 

PD02407 

3-BISPHOSPHOGLYCERATE- 
INDEPENDENT PHOSPHOGLYCER. 

PD02407K 12.59 6.529e-10 360- 
383 

894 

PD02407 

3-BISPHOSPHOGLYCERATE-
INDEPENDENT PHOSPHOGLYCER. 

PD02407K 12.59 6.529e-10 360- 
383 

895 

PR00237 

RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 

PR00237B 13.50 9.100e-14 116-

138 PR00237F 13.57 1.360e-13 
217 hi VDfiminn io <i o o£Qa 

13 353-380 PR00237E 13.03 
7.120e-12 243-267 PR00237D
8.94 4.150e-11 194-216
PR00237A 11.48 4.375e-11 83-
108 

896 

BL00129 

Glycosyl hydrolases family 31 proteins. 

BL00129D 16.76 8.258e-26 634-
678 BL00129A 26.21 1.720e-25
384-430 BL00129E 22.60 4.857e-
23 698-734 BL00129C 15.12
1.750e-22 596-624 BL00129B
19.19 5.891e-18 495-522
BL00129F 26.19 7.545e-15 814-
852

897 

BL00598 

Chromo domain proteins. 

BL00598 14.45 1.220e-13 9-31 

898 

BL00518 

Zinc finger, C3HC4 type (RING finger), 
proteins. 

BL00518 12.23 6.000e-09 396-405 

899 

PD01101

INHIBITOR HEAVY CHAIN
CHANNEL IN.

PD01101B 21.53 1.000e-40 274-
327 PD01101D 24.45 1.000e-40
457-512 PD01101A 18.25 6.268e-
23 83-117 PD01101C 12.69
1.237e-16 366-386 PD01101E
6.73 7.750e-12 566-576

900

PR00600

PROTEIN PHOSPHATASE PP2A 55KD
REGULATORY SUBUNIT 
SIGNATURE 

PR00600A 11.61 5.979e-09 31-52 

901 

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 

PD01066 19.43 8.116e-31 24-63

903 

BL01115

GTP-binding nuclear protein ran proteins.

BL01115A 10.22 1.509e-11 21-65

906 

DM00215 

PROLINE-RICH PROTEIN 3. 

DM00215 19.43 2.174e-13 539- 
572 DM00215 19.43 4.750e-12
549-582 DM00215 19.43 9.824e-
11 551-584 DM00215 19.43
2.929e-10 548-581 DM00215
19.43 4.054e-10 550-583
DM00215 19.43 5.339e-10 552- 
585 DM00215 19.43 7.107e-10 
544-577 

907

PR00988

URIDINE KINASE SIGNATURE 

PR00988A 6.39 6.276e-12 314- 
332 

908

BL00107

Protein kinases ATP-binding region
proteins. 

BL00107A 18.39 5.950e-17 1125- 
1156 

909 

BL00107

Protein kinases ATP-binding region
proteins.

BL00107A 18.39 5.950e-17 1118-
1149 


BL00107

Protein kinases ATP-binding region
proteins.

BL00107A 18.39 8.560e-13 150-
181

911 

BL00107 

Protein kinases ATP-binding region 
proteins. 

BL00107A 18.39 8.560e-13 150- 
181 

912 

PF00856

SET domain proteins. 

PF00856A 26.14 4.553e-ll 243- 
280 

913 

PF00628 

PHD-finger. 

PF00628 15.84 6.400e-13 197-212 

914 

PR00962 

LETHAL(2) GIANT LARVAE 
PROTEIN SIGNATURE 

PR00962D 10.40 1.000e-27 435-
459 PR00962G 15.71 4.086e-26 
593-618 PR00962B 11.98 9.122e- 
26 296-319 PR00962A 13.28 
6.143e-22 15-34 PR00962C 8.00 
4.000e-21 348-369 PR00962F 
12.39 9.769e-21 552-572 
PR00962H 13.32 2.636e-20 623- 
643 PR00962I 11.68 9.786e-20 
692-712 PR00962E 8.81 2.915e- 
18 515-534 



LETHAL(2) GIANT LARVAE
PROTEIN SIGNATURE

PR00962D 10.40 1.000e-27 365-
389 PR00962G 15.71 4.086e-26
523-548 PR00962A 13.28 6.143e-
22 15-34 PR00962C 8.00 4.000e-
21 278-299 PR00962F 12.39
9.769e-21 482-502 PR00962H
13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-
642 PR00962E 8.81 2.915e-18
445-464

916 

BL00134 

Serine proteases, trypsin family, histidine 
proteins.

BL00134A 11.96 5.886e-14 90-

917

BL00478

LIM domain proteins. 

D\J\J\) L \ 1 OD IH. I y O.J7JC- 1 j Z 1 1- 

226 BL00478B 14.79 6.712e-10 

918 

PR00049 

WILM'S TUMOUR PROTEIN 
SIGNATURE 

PR00049D 0.00 5.729e-09 973- 
988 

922 

BL00150 

Acylphosphatase proteins. 

BL00150 25.33 1.000e-40 37-84 

924 

DM00031 

IMMUNOGLOBULIN V REGION. 

DM00031B 15.41 8.063e-09 79- 

I IJ 

925 

BL00072 

Acyl-CoA dehydrogenases proteins. 

BL00072D 30.08 2.837e-24 280- 
331 BL00072E 24.12 8.200e-24 
368-411 BL00072C 25.30 7.873e- 

70 77£ 7£7 RT OAA77R O /1ft 
ZUZZO-ZO/ DL.UUU /ZtS y.Ho 

6.049e-12 183-196 

927 

BL00237 

G-protein coupled receptors proteins. 

BL00237C 13.19 1.692e-13 229-

7<£ m O0777A 77 £ft £ £<7o 17 
ZjO oJUUUZj /A Z /.Oo O.Oj /e-i j 

90-130 BL00237D 11.23 9.571e-

1 7 700 707 

928 

BL01033 

Globins profile. 

BL01033A 16.94 7.923e-18 25-47
BL01033B 13.81 1.000e-15 93- 
105 

929 

BL00216 

Sugar transport proteins. 

BL00216B 27.64 8.714e-13 203-
253

932 

BL00415 

Synapsins proteins. 

BL00415N 4.29 9.519e-10 353- 
397 BL00415N 4.29 2.117e-09
63-107 BL00415N 4.29 3.628e-09
57-101 BL00415N 4.29 5.664e-09
347-391 

933 

PD02448 

TRANSCRIPTION PROTEIN DNA- 
BINDIN. 

PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-
133 PD02448C 13.62 1.000e-40
152-189 PD02448E 11.33 9.000e-
30 223-249 PD02448F 14.22

O A^Ao 7< 7A7 701 T>TW>44ftFl 

11.48 3.659e-18 197-211

PD02448G 10.73 7.857e-16 293-

306 

934

DM00191

SPAC8A4.05C DAUNORUBICIN. 

nMOniQIF) 1 1 04 0 OJttp-10 1^6- 

ULVX\J\J I y l U 1 7.UOJC~i w uu 

175 

935 

BL01115 

GTP-binding nuclear protein ran proteins. 

BL01115A 10.22 4.696e-10 67-
111 

936 

BL00019 

Actinin-type actin-binding domain
proteins. 

BL00019D 15.33 8.138e-14 865- 
895 

937 

PR00762 

CHLORIDE CHANNEL SIGNATURE 

PR00762A 14.22 4.000e-22 183- 
201 PR00762C 9.29 1.000e-21
268-288 PR00762E 12.07 3.250e- 

OA COA COT DDrtrtn/Cor\ 1 1 70 

zU DzUO J / r KUU /ozD 1 1 .zy 

1 Aftfta 1G/170 4Q1 PD007£7I? 

15 12 1 429e-19 <R8-558 
PR00762B 12.12 1.818e-18 214- 
234 PR00762G 14.13 3.455e-17 
577-592 

938 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 9.500e-25 291-334 

939 

DM01111 

4 kw PHOSPHATASE
TRANSFORMING 61K PDF1.

DM01111E 17.28 1.568e-10 248-
297 DM01111E 17.28 5.168e-10
659-708 DM01111D 16.76 
5.263e-09 279-325 DM01111M 
10.67 8.674e-09 911-935 

940 

BL00107

Protein kinases ATP-binding region
proteins.

BL00107B 13.31 1.000e-14 293-
309 BL00107A 18.39 6.760e-13 
229-260 

942 

BL01160 

Kinesin light chain repeat proteins. 

BL01160B 19.54 9.832e-11 543-
597 

943

PD01066 

PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU.

PD01066 19.43 3.500e-35 8-47 

945 

BL00989 

Clathrin adaptor complexes small chain 
proteins. 

BL00989B 26.51 1.000e-40 66- 
117 BL00989A 11.66 1.000e-13 
5-19 

946 

PR00178 

FATTY ACID-BINDING PROTEIN 
SIGNATURE 

PR00178D 13.52 9.571e-09 450-
469 

947 

BL00178 

Aminoacyl-transfer RNA synthetases 
class-I proteins. 

BL00178B 7.11 4.857e-09 713- 
724 

948 

PF00628 

PHD-finger. 

PF00628 15.84 8.412e-14 201-216 

951 

BL00216 

Sugar transport proteins. 

BL00216B 27.64 2.050e-10 180- 
230 

952 

PR00926 

MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 

PR00926F 17.75 4.300e-11 26-49
PR00926F 17.75 6.348e-09 134- 
157 

955 

PF00109 

Beta-ketoacyl synthase. 

PF00109 13.08 2.846e-12 342-357 

957 

PR00069 

ALDO-KETO REDUCTASE 
SIGNATURE 

PR00069A 16.01 8.826e-24 26-51
PR00069B 11.33 1.514e-17 86- 
105 PR00069C 16.03 8.816e-14 
155-173 

958 

PF00583 

Acetyltransferase (GNAT) family. 

PF00583A 12.53 5.500e-10 631- 
642 

961 

PR00328 

GTP-BINDING SAR1 PROTEIN
SIGNATURE 

PR00328A 10.62 8.740e-10 7-31 

962 

BL00354 

HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook).

BL00354A 3.83 9.438e-10 1489- 
1499 

963 

BL00354 

HMG-I and HMG-Y DNA-binding 
domain proteins (A+T-hook).

BL00354A 3.83 9.438e-10 1489- 
1499 

964 

BL00027 

'Homeobox' domain proteins. 

BL00027 26.43 7.188e-27 53-96

965 

PF00992 

Troponin. 

PF00992A 16.67 2.421e-09 581-
616 

966 

PR00515 

5-HYDROXYTRYPTAMINE 1F
RECEPTOR SIGNATURE

PR00515D 7.91 5.741e-09 13-33

967 

BL00579 

Ribosomal protein L29 proteins. 

BL00579B 21.99 5.065e-21 164- 
194 

970 

BL00504 

Fumarate reductase / succinate 
dehydrogenase FAD-binding site 
proteins. 

BL00504C 18.68 2.227e-24 34-59 
BL00504D 10.43 7.261e-21 75-93 

973 

PF00580 

UvrD/REP helicase. 

PF00580A 13.37 4.720e-09 249- 
271 

974 

PR00456 

RIBOSOMAL PROTEIN P2 
SIGNATURE 

PR00456F 5.86 1.000e-10 242-254

975 

BL00237 

G-protein coupled receptors proteins. 

BL00237A 27.68 4.429e-22 99- 
139 

976

BL00031

Nuclear hormones receptors DNA- 
binding region proteins. 

BL00031A 19.55 7.158e-33 60-93 
BL00031B 22.25 5.500e-28 94- 
126 

977 

PD00066 

PROTEIN ZINC-FINGER METAL- 
BINDI. 

PD00066 13.92 8.200e-16 196-209
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489
PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181

978 

BL00721 

Formate--tetrahydrofolate ligase proteins.

BL00721B 13.21 1.000e-40 346-
401 BL00721D 13.90 1.000e-40
538-592 BL00721E 13.46 1.000e-
40 597-646 BL00721I 18.79
2.500e-40 814-860 BL00721H
21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-
321 BL00721C 16.92 4.000e-30
498-535 BL00721F 15.96 8.232e-
27 660-702 BL00721G 7.97
3.017e-10 721-734

981 

PD00126 

PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 

PD00126A 22.53 2.552e-09 180- 
201 

982 

BL00869 

Renal dipeptidase proteins. 

BL00869C 12.58 3.172e-19 59-95 
BL00869E 13.12 9.129e-18 120- 
157 BL00869J 15.60 6.032e-17 
270-310 BL00869H 11.08 1.840e- 
16 219-242 BL00869G 13.55 
2.543e-16 192-214 BL00869F
12.77 7.031e-14 157-192 
BL00869I 12.92 3.274e-12 242- 

95-124 BL00869B 15.55 9.382e- 
10 31-61 

983 

PR00196 

ANNEXIN FAMILY SIGNATURE 

PR00196F 23.89 2.125e-09 92-108

984 

BL00485 

Adenosine and AMP deaminase proteins. 

BL00485D 30.82 2.427e-10 154- 
209 


* Results include in order: accession number subtype; raw score; p-value; position of signature in amino acid 
sequence 
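
By way of illustration only, the ordering stated in the footnote above (accession number subtype; raw score; p-value; position) can be read programmatically. The following Python sketch is not part of the original disclosure. The names `Hit`, `HIT_RE` and `parse_results` are hypothetical, and the pattern assumes a cleanly transcribed Results cell in which each hit has the form `PR00320C 13.01 4.240e-10 169-184`, possibly wrapped across lines after a hyphen.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    subtype: str      # accession number subtype, e.g. "PR00320C"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature position
    end: int          # last residue of the signature position


# Assumed layout of one hit: subtype, score, p-value, start-end.
HIT_RE = re.compile(
    r"(?P<subtype>[A-Z]{2}\d{5}[A-Z]?)\s+"
    r"(?P<score>\d+\.\d+)\s+"
    r"(?P<p>\d\.\d+e-\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)"
)


def parse_results(cell: str) -> List[Hit]:
    """Parse one Results cell into a list of hits."""
    # Rejoin values that were wrapped after a hyphen ("169-\n184", "8.875e-\n27").
    joined = re.sub(r"-\s*\n\s*", "-", cell)
    return [
        Hit(m["subtype"], float(m["score"]), float(m["p"]),
            int(m["start"]), int(m["end"]))
        for m in HIT_RE.finditer(joined)
    ]


if __name__ == "__main__":
    # Results cell for SEQ ID NO: 679 as it is wrapped in the table.
    cell = "PR00320C 13.01 4.240e-10 169- \n184 PR00320A 16.74 6.294e-10 \n169-184 "
    for hit in parse_results(cell):
        print(hit)
```

Cells that remain garbled after transcription simply yield fewer matches under this sketch, so a count of parsed hits per SEQ ID NO can serve as a rough check on transcription quality.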


TABLE 4 


SEQ ID
NO: 

PFAM NAME 

DESCRIPTION 

p-value 

PFAM 
SCORE 

2 

ig

Immunoglobulin domain 

3.9e-17 

60.3 

3 

HSP90 

Hsp90 protein 

0 

1548.4 

6 

tsp_1

Thrombospondin type 1 domain 

0.002 

22.1 

7 

7tm_1

7 transmembrane receptor (rhodopsin 
family) 

6.7e-08 

27.3 

9 

PWWP 

PWWP domain 

8.1e-16 

66.0 

12 

C1q

C1q domain

1.7e-26 

101.5 

13 

C1q

C1q domain

2e-20 

81.3 

14 

Aa_trans 

Transmembrane amino acid
transporter protein

2.7e-42

153.9



15 

E1-E2_ATPase

E1-E2 ATPase 

6.3e-124 

412.2 

16 

trypsin 

Trypsin 

1.2e-87 

278.6 

17 

ig 

Immunoglobulin domain 

7.6e-12 

43.2 

18 

lectin_c

Lectin C-type domain 

0.0003 

21.2 

20 

Alpha_L_fucos

Alpha-L-fucosidase 

l^e-217 

736.5

22

pkinase 

Eukaryotic protein kinase domain 

3.3e-87 

303.1 

23 

pkinase 

Eukaryotic protein kinase domain 

2.7e-85 

296.8 

24 

pkinase 

Eukaryotic protein kinase domain 

2.7e-85 

296.8 

25 

ank 

Ank repeat 

5.5e-14 

59.9 

27 

pkinase 

Eukaryotic protein kinase domain 

1.5e-100 

347.4 

28 

spectrin 

Spectrin repeat 

4e-57 

203.2 

29 

spectrin 

Spectrin repeat 

4e-57 

203.2 

30 

WD40 

WD domain, G-beta repeat 

1.2e-07 

38.8 

33 

rrm 

RNA recognition motif. 

1.1e-17

72.2 

34 

rrm 

RNA recognition motif. 

1.1e-17

72.2 

36 

7tm_1

7 transmembrane receptor (rhodopsin 
family) 

3e-36 

117.3 

37 

ank 

Ank repeat 

5.9e-25 

96.3 

38 

SRF-TF 

SRF-type transcription factor 

1.4e-36 

133.9 

40 

alk_phosphatase 

Alkaline phosphatase 

0 

1034.9 

44 

zf-C2H2 

Zinc finger, C2H2 type 

8.6e-103 

354.9 

45 

sugar_tr

Sugar (and other) transporter 

3.1e-08 

40.3 

47 

7tm_2 

7 transmembrane receptor (Secretin 
family) 

6.4e-79 

275.6 

50 

zf-C2H2 

Zinc finger, C2H2 type 

1.3e-98 

341.0 

51 

filament 

Intermediate filament proteins 

1.2e-176 

600.3 

52 

zf-C3HC4 

Zinc finger, C3HC4 type (RING 
finger) 

2.7e-10 

37.7 

53 

Cadherin_C_term

Cadherin cytoplasmic region 

1.9e-94 

327.2 

54 

S_100

S-100/ICaBP type calcium binding 
domain 

5.2e-18 

73.3 

58 

inositol_P

Inositol monophosphatase family 

5e-13 

49.8 

59 

7tm_1

7 transmembrane receptor (rhodopsin 
family) 

8.8e-46 

147.6 

60 

Kunitz_BPTI 

Kunitz/Bovine pancreatic trypsin 
inhibito 

3.7e-47 

148.6 

62 

DAD 

DAD family 

2.5e-74 

260.3 

63 

MOZ_SAS 

MOZ/SAS family 

5.9e-133 

455.1 

64 

MOZ_SAS

MOZ/SAS family 

1.7e-123 

423.6 

65 

ras 

Ras family 

9.3e-89 

308.3 

67 

Ham1p_like

Ham1 family

3.7e-49 

176.7 

68 

7tm_1

7 transmembrane receptor (rhodopsin 
family) 

5.2e-39 

126.1 

70 

zf-C2H2 

Zinc finger, C2H2 type

1.5e-112 

387.3 

71 

Peptidase_M41 

Peptidase family M41 

1.2e-110 

381.0 

72 

abhydrolase 

alpha/beta hydrolase fold 

9.8e-05 

26.5 

81 

K_tetra 

K+ channel tetramerisation domain 

0.022 

-16.8 

82 

pkinase 

Eukaryotic protein kinase domain 

5e-49 

176.3 

84 

AAA 

ATPases associated with various 
cellular act 

1.3e-77 

271.3 

85 

homeobox 

Homeobox domain 

1.4e-28 

108.3 

87 

TGF-beta 

Transforming growth factor beta like 

6.7e-68 

210.2 

91 

mito_carr

Mitochondrial carrier proteins 

4.6e-57 

198.5 

95 

adenylatekinase 

Adenylate kinase 

1.1e-15

60.0 

96 

ig

Immunoglobulin domain 

4.1e-20 

69.8 

99 

CNH 

CNH domain 

3.4e-120 

412.7 

100 

homeobox 

Homeobox domain 

7.4e-32 

119.3 

101

zf-C2H2

Zinc finger, C2H2 type

2.2e-47 

170.8 

102 

zf-C2H2 

Zinc finger, C2H2 type 

4.4e-89 

309.4 

103 

dynamin 

Dynamin family 

1.4e-150 

513.6 

104 

lectin_c

Lectin C-type domain 

4.2e-15 

63.6 

105 

lectin_c

Lectin C-type domain 

4.2e-15 

63.6 

108 

metalthio 

Metallothionein 

2e-25 

97.9

112 

HSP20 

Hsp20/alpha crystallin family 

2.6e-20 

77.7 

115 

EF_TS

Elongation factor TS 

3.8e-63 

221.1 

116 

sugar_tr

Sugar (and other) transporter 

4e-63 

223.1 

118 

catalase 

Catalase 

0 

1158.9

119 

UCH 

Ubiquitin carboxyl-terminal 
hydrolase, famil 

1e-10

24.4

122 

metalthio 

Metallothionein 

2.8e-25 

97.4 

125 

adh_short

short chain dehydrogenase 

1.6e-45 

164.6 

126 

KRAB 

KRAB box 

7.9e-25 

95.9 

127 

G-alpha 

G-protein alpha subunit 

1e-249

843.0 

128 

mito_carr

Mitochondrial carrier proteins 

2e-65 

227.2 

131 

EF1BD 

EF-1 guanine nucleotide exchange 
domain 

4.9e-53 

189.6 

132 

GYF 

GYF domain 

4.9e-28 

106.6

133 

GYF 

GYF domain 

4.9e-28 

106.6 

134 

lipocalin 

Lipocalin / cytosolic fatty-acid 
binding pr 

2.1e-33 

119.1 

135 

pkinase 

Eukaryotic protein kinase domain 

3.3e-86

299.8 

136 

ank 

Ank repeat 

2.2e-29 

111.1 

137 

IL8 

Small cytokines 
(intecrine/chemokine), inter 

3.1e-18 

65.2 

139 

pyridoxal_deC

Pyridoxal-dependent decarboxylase 
conse 

0.00011 

19.0 

140 

cadherin 

Cadherin domain 

1.3e-88 

307.8 

142 

efhand 

EF hand 

5.7e-33 

123.0 

143 

Acyltransferase 

Acyltransferase 

2e-29 

111.2 

146 

cytochrome_c

Cytochrome c 

1.7e-33

124.7

147 

pkinase 

Eukaryotic protein kinase domain 

2.3e-86 

300.3 

148 

PDZ 

PDZ domain (Also known as DHR or 
GLGF). 

1.7e-09 

45.0 

149 

aldo_ket_red 

Aldo/keto reductase family 

7.4e-189 

640.8 

150 

homeobox 

Homeobox domain 

3.2e-08 

38.7 

151 

PseudoU_synth_1

tRNA pseudouridine synthase 

4.7e-57 

203.0 

152 

abhydrolase 

alpha/beta hydrolase fold 

1.7e-31 

118.0 

153 

PDZ 

PDZ domain (Also known as DHR or 
GLGF). 

1.1e-09

45.6 

156 

PHD 

PHD-finger

7.6e-15 

62.8 

157 

fn3

Fibronectin type III domain 

0.015 

21.9 

158 

homeobox 

Homeobox domain 

2.7e-27 

104.1 

160 

PWI 

PWI domain 

3.9e-24 

93.6 

162 

DnaJ 

DnaJ domain 

2e-06 

34.8 

164 

Cbl_N 

CBL proto-oncogene N-terminal 
domain 

8e-117 

401.5 

166 

metalthio 

Metallothionein 

3.1e-26 

100.6 

167 

LRR 

Leucine Rich Repeat 

0.00069 

26.3 

169 

fibrinogen_C 

Fibrinogen beta and gamma chains, 
C-term 

5.3e-180

611.4 

170 

fibrinogen_C

Fibrinogen beta and gamma chains, 
C-term 

5.3e-180 

611.4 

171 

fibrinogen_C

Fibrinogen beta and gamma chains, 
C-term 

1e-149

510.8 

173 

homeobox 

Homeobox domain 

1.5e-29

111.6 

174 

FYVE 

FYVE zinc finger

7.4e-28 

103.8 

175 

GRIP 

GRIP domain 

3.9e-08 

40.5 

182 

pkinase 

Eukaryotic protein kinase domain 

3.4e-71 

250.0 

185 

CAP_GLY

CAP-Gly domain 

5.6e-51 

182.8 

186 

TBC 

TBC domain 

2.2e-50 

180.8 

187 

TBC 

TBC domain 

2.2e-50 

180.8 

188

PDZ

PDZ domain (Also known as DHR or
GLGF).

4e-13

57.0 


Kelch

Kelch motif

j.ze-iuo 

3o5.o 


Tropomyosin 

Tropomyosins

j.oe- i/i 

c^> c >i 

7QO 

iyz 

Rieske 

Rieske [2Fe-2S] domain

u. uu io 

IO f 

I6.J 

199 

ig 

Immunoglobulin domain 

5.9e-19 

66.1 

202

EGF

EGF-like domain

3.4e-54 

193.5 

203 

trefoil

Trefoil (P-type) domain

1e-24

95.5 

204

TBC

TBC domain

6.je-J8 

139.0 

205

efhand

EF hand

0.0096

22.6 

206

ISK_Channel

Slow voltage-gated potassium 
channel 

0.0031

5.1

207

trefoil

Trefoil (P-type) domain 

z.ye-4o 

115.1

209

Ribosomal_S13

Ribosomal protein S13/S18

1.2e-78

274.7

210 

hemopexin 

Hemopexin 

1.3e-62

221.5 

213 

TBC

TBC domain 

2.5e-48 

174.0 

215 

Basic 

Myogenic Basic domain 

4.3e-50 

179.8 

216 

Ribosomal_L24 

KOW motif 

8.2e-23 

89.2 

222 

fn3

Fibronectin type III domain 

7.3e-141 

481.4 

223 

cofilin_ADF

Cofilin/tropomyosin-type actin- 
binding pr 

9.3e-47 

168.8 

224 

efhand 

EF hand 

6.1e-06 

33.2 

225 

Pterin_4a 

Pterin 4 alpha carbinolamine 
dehydratase 

9.3e-42 

152.1 

228 

ABC_tran

ABC transporter 

4.1e-110 

379.2 

234 

E1_DerP2_DerF2

E1 family

3.7e-90 

312.9 

235 

E1_DerP2_DerF2

E1 family

1.6e-48 

174.6 

237 

PMP22_Claudin 

PMP-22/EMP/MP20/Claudin family 

1.7e-25

98.1 

238 

Opiods_neuropep

Vertebrate endogenous opioids 
neurope 

1.8e-159 

543.2 

239

eIF-5a

Eukaryotic initiation factor 5A 
hypusine 

5.9e-104 

358.8 

240

Amino_oxidase

Flavin containing amine oxidase 

O 1 1 

2._>e-l 1 

3 /.O 

243 

zf-C2H2 

Zinc finger, C2H2 type 

2.1e-99 

343.6 

244

Band_7

SPFH domain / Band 7 family

2.3e-53 

190.7 

245

ank 

Ank repeat 

1.6e-88

307.5 

246 

zf-C2H2 

Zinc finger, C2H2 type 

6.7e-49 

175.9 

247 

actin 

Actin 

2.3e-42 

140.3 

248 

ER_lumen_recept

ER lumen protein retaining receptor

2.4e-155 

529.5 

250

PMP22_Claudin

PMP-22/EMP/MP20/Claudin family

Z.Zeoo 

1 A A A 

252

Collagen

Collagen triple helix repeat (20
copies)

1.4e-13

co 
JO.O 

255 

C2 

C2 domain 

0.052 

7.8 

257

CAP_GLY

CAP-Gly domain

1.4e-20 

ol.o 

260

WD40

WD domain, G-beta repeat


O 1 o c 

218.5 

261

WD40

WD domain, G-beta repeat

n a^ /ro 

O 1 o c 

"7 /CO 
262

WD40

WD domain, G-beta repeat

A A— /TO 

O 1 o c 
ZIO.J 

263

cofilin_ADF

Cofilin/tropomyosin-type actin- 
binding pr 

7.8e-21 

82.6 

264

Ribosomal_L14

Ribosomal protein L14p/L23e

9.2e-10

40.0

265 

SAPA 

Saposin A-type domain

*+. £ te - z / 


266 

SAPA 

Saposin A-type domain 

4.4e-27 

103.4 

267 

ABC_tran

ABC transporter 

9.5e-39 

142.2 

269 

Ribosomal_L14 

Ribosomal protein L14p/L23e

6.2e-62 

219.2 

270 

abhydrolase 

alpha/beta hydrolase fold 

0.042 

-3.3 

272 

ras 

Ras family 

4.3e-87 

302.8 

273

rrm

RNA recognition motif.

U.U IH 

1 H.O 

275

lipocalin

Lipocalin / cytosolic fatty-acid
binding pr

O C p /II 

z. je-4 1 

1 4A 4 

276 

ras 

Ras family

1 1p-^7 
i . 1 e-o / 

7^R ^ 

277 

UCH 

Ubiquitin carboxyl-terminal
hydrolase famil

1 ?p-147 

1 It/ 

SO^ Q 

278 

START 

START domain 

3.2e-09

44.1 

279 

WD40 

WD domain, G-beta repeat

1.8e-27

104.7 

282 

G-patch 

G-patch domain

7.8e-22 

86.0 

287 

Anti_proliferat

BTG1 family

1.2e-101

351.0 

289 

KRAB 

KRAB box 

7 1p-?1 

SO R 
0Z..0 

293 

7tm_3

7 transmembrane receptor

JJC" / J 


295 

SET 

SET domain 

5e-30 

113.2 

296 

Pyridox_oxidase

Pyridoxamine 5'-phosphate oxidase

l ^lp-7^ 

I JC* IKj 

7^8 0 

297 


RNA recognition motif.


\fO 0 

298 

Ubie_methyltran

ubiE/COQ5 methyltransferase family 

6.3e-05 

-96.3 

300

Ubie_methyltran

ubiE/COQ5 methyltransferase family


1 10 1 

-J 15. 1 

301


FAD/NAD-binding Cytochrome
reductase

/ . / e-o l 

Z 1 J. J 

309 

G-patch

G-patch domain

j.ie-14 

OU. / 

307

7tm_1

7 transmembrane receptor (rhodopsin
family)

/. /e-4_> 

1 Jo.Z 

308

PH

PH domain

0 OOI <\ 
U.UU J J 

19 » 
1 /.o 

310

7tm_1

7 transmembrane receptor (rhodopsin
family)

i .^te-o^f 

990 9 
Z /U.O 

311

Rhodanese

Rhodanese-like domain

i ^p ftd 

J.JC-04 

996 7 
ZZO. / 

312 

tubulin

Tubulin/FtsZ family

A Op 78£ 

OAT A 

314 

SURF4

SURF4 family

1 9a 1 OQ 

676 6 
0 /O.O 

325 

IMS 

impB/mucB/samB family

?p ^8 

907 ^ 

327 

cadherin

Cadherin domain

H.JC-7 l 

T 1 f\ 0 

329 

NAC 

NAC domain

7 1p 7R 

1 07 R 
I vl / .0 

330 

IP_trans

Phosphatidylinositol transfer protein

OJc-70 

1^8 7 

JJO. / 

332 

TFIIS 

Transcription factor S-II (TFIIS)

O.OC-UJ 

90 T 

337 

zf-C2H2 

Zinc finger, C2H2 type

J.OC-0 I 

916 6 

340 

AIRS 

AIR synthase related protein

4p-^7 

190 9 

1Z.V.Z 

343 

annexin

Annexin

4 ^p-80 
t.uc o\y 

970 4 
z /y.H 

346 

Stathmin 

Stathmin family

1 Rp-00 

1 .OC-7V 

T 14 0 

347 

Ribosomal_L16

Ribosomal protein L16 

4.6e-09 

34.9 

348 

lactamase_B

Metallo-beta-lactamase superfamily

0 019 

6 0 
-O.v 

351 

efhand

EF hand

9 Sp-14 

61 0 
0 1 -vl 

353 

lectin_c 

Lectin C-type domain 

1.3e-05 

32.1 

354 

WD40 

WD domain, G-beta repeat

9 9p 1 R 

z.ze- 10 

74 ^ 

/H.J 

360 

lipocalin

Lipocalin / cytosolic fatty-acid
binding pr

o.je-iv 

TR T 

362 

Acetyltransf

Acetyltransferase (GNAT) family

0 0010 

V.UV17 

94 0 

365 

tRNA-synt_1

tRNA synthetases class I (I, L, M and


69R 9 

366 

Sulfatase 

Sulfatase 

VI. IC Z.Z.O 

770 6 

/ / V/.VI 

368 

START 

START domain

3.8e-11

50.5

369 

pkinase 

Eukaryotic protein kinase domain

9 4p-10 

41 1 

370 

ACBP 

Acyl CoA binding protein

4 4p- < ir» 

H.HC JU 

100 7 

373 

pkinase 

Eukaryotic protein kinase domain

1 Ap-Q4 

377 5 

JZ / .-J 

373 

EGF 

EGF-like domain

9 £p-19 
z.oe- iz 

S4 T 

375 

zf-C2H2 

Zinc finger, C2H2 type 

8.2e-64 

225.4 

377 

KRAB 

KRAB box 

3.7e-27 

103.7 

379 

SET 

SET domain 

7.3e-61 

215.6 

380 

Glyco_transf_8

Glycosyl transferase family 8 

0.0028 

-40.1 

381 

zf-C2H2 

Zinc finger, C2H2 type 

4.3e-06 

33.7 

383 

Glyco_transf_8 

Glycosyl transferase family 8 

0.0028 

-40.1 

384 

RasGEF 

RasGEF domain

8.1e-43 

1 7 

385 

TBC 

TBC domain

0.017 

-66 6 
uu.u 

389 

Glycos_transf_2 

Glycosyl transferases 

1.3e-15

65.3 

390 

Na_Ca_Ex

Sodium/calcium exchanger protein


^69 7 

391 

fn3

Fibronectin type III domain


1S9 6 

392 

fn3

Fibronectin type III domain



393 

fn3

Fibronectin type III domain



394 

ldl_recept_b

Low-density lipoprotein receptor
repeat


17^ R 
1 / J.O 

395 

Ribosomal_L30

Ribosomal protein L30p/L7e

0.0023 

16.0 

396 

Oxysterol_BP 

Oxysterol-binding protein

1.5e-94 

327.5 

397 

RDS_ROM1

Peripherin/rom-1


l^J.7 

399 

lactamase_B

Metallo-beta-lactamase superfamily

J.tC-J7 

14^ 6 

402 

F-box 

F-box domain

0 000? 

?R 1 

Z.0 . 1 

403 

CLP_protease

Clp protease

4 Xp-^4 

9?6 ? 

405 

Ribosomal IIS 
Ae 



960 0 

ZU7.U 

406 

LIM 

LIM domain containing proteins

0 000? 1 

?0 7 

410 

tRNA-synt_1c

tRNA synthetases class I (E and

le-?^6 

1 C Z. JVJ 

70Q R 

411 

NTP_transf_2

Nucleotidyltransferase domain


67 0 

412 

DEAD 

DEAD/DEAH box helicase

0 00016 

17 ? 

414 

DUF94 

Domain of unknown function DUF94

0 0001 1 

U.WU 1 i 

96 0 

415 

tubulin 



071 7 

420 

SET 

SET domain 

3.3e-57 

203.5 

421 

WD40

WD domain, G-beta repeat

^ 1 p 90 

1 OO A 

luy.o 

4?1 

zf-C2H2

Zinc finger, C2H2 type

i.oe-3y 

1 A/1 Q 

424 

pkinase

Eukaryotic protein kinase domain

o .ye- / j 


428

LIM

LIM domain containing proteins 

1 8o 1A 

1 OA 9 

431 

kazal 

Kazal-type serine protease inhibitor
domain

j./c-lo 

9^ Q 

432 

SH2 

Src homology domain 2

1 4p_/ : ;7 

1QR 4 

433 

zf-C2H2 

Zinc finger, C2H2 type

? ftp- 144 

409 7 

434 

ras 

Ras family

0 01? 

-106 8 

436 

E1-E2_ATPase

E1-E2 ATPase

1 fip- 1 1 7 

101 0 

437 

RNA_pol_A 

RNA polymerase alpha subunit 

0 

1077.7 

438 

PHD 

PHD-finger


SI 7 

439 

lectin_c

Lectin C-type domain 

4.7e-30 

113.3 

440 

zf-C2H2

Zinc finger, C2H2 type

i la a<; 
i . i e-o j 

91 1 A 

441 

arrestin

Arrestin (or S-antigen)

9 Op O^A 

OJo.l 

442 

aminotran_3

Aminotransferases class-III
pyridoxal-pho


911 1 

J, j 1.1 

443 

UCH-1 

Ubiquitin carboxyl-terminal
hydrolases famil

O. JC" l£. 

S? 6 

444 

CTF_NFI

CTF/NF-I family 

2.6e-277

934.6 

451 

T-box 

T-box 

3.8e-117

402.6 

453 

Rieske 

Rieske [2Fe-2S] domain

2.6e-13 

57.7 

454 

zf-C2H2 

Zinc finger, C2H2 type

3.9e-64 

226.5 

456 

homeobox 

Homeobox domain 

2.8e-08

38.9

459 


Immunoglobulin domain

2.6e-20 

7ft S 

460 

Hydrolase 

haloacid dehalogenase-like hydrolase 

4e-25 

96.9 

462 

rve 

Integrase core domain

1 .UC - 1 _> 

SO 7 

466 

CH 

Calponin homology (CH) domain

2.4e-17

71.1

467 

CH 

Calponin homology (CH) domain 

2.4e-17 

71.1 

468 

Sterol_desat

Sterol desaturase

7.5e-38 

139.2 

469 

pro_isomerase

Cyclophilin type peptidyl-prolyl cis- 
tr 

2.6e-63 

220.9 

470 

Peptidase_M24

metallopeptidase family M24 

6e-08 

28.1 

471 

PDZ 

PDZ domain (Also known as DHR or 
GLGF). 

5.4e-129 

441.9 
472

myb_DNA-binding

Myb-like DNA-binding domain

3.6e-06 

33.9 

473

ZZ

Zinc finger present in dystrophin, CB

0.012 

20.0 

474 

EF1G_domain

Elongation factor 1 gamma,
conserved doma

6.3e-88 

305.5 

475

Ribosomal_L31e

Ribosomal protein L31e

6.1e-66 

232.5 

*\ /o 

Pin 

Pin Hnmain 

J U UUliiaLll 

2.5e-75 

263.7 

477

SH3

SH3 domain

l.le-12 

55.6 

478

MoaA_NifB_PqqE

moaA / nifB / pqqE family

0.002 

-17.7 

479

FYVE

FYVE zinc finger

9.3e-21 

78.6 

480

DNA_pol_A

DNA polymerase family A 

2.3e-46 

167.4 

482

adh_short

short chain dehydrogenase

1.2e-62 

221.6

483

ank

Ank repeat

1.3e-17 

71.9 

484

IMS

impB/mucB/samB family

2.2e-83 

290.5 

486

TIR

TIR domain

3.2e-19 

67.8 

487

FMO-like

Flavin-binding monooxygenase-like

0 

1425.5 

488 

I_LWEQ

I/LWEQ domain

9.5e-101 

341.0 

4yj 

homeobox 

Homeobox domain

3.6e-06 

30.8 

497

pkinase

Eukaryotic protein kinase domain

2.3e-166 

566.1 

499

fn3

Fibronectin type III domain

2.5e-237 

801.8 

501 

LRR 

Leucine Rich Repeat 

9.3e-31

115.6 

502 

RGS

Regulator of G protein signaling
domain

0.041 

11.9 

503 

filament 

Intermediate filament proteins

1e-142

487.5 

505 

fn3

Fibronectin type III domain

1.3e-100 

347.7 

506 

HECT

HECT-domain (ubiquitin-
transferase).

1e-13

59.0 

507

Ribosomal_L7Ae

Ribosomal protein L7Ae

5.7e-26 

99.7 

508

WD40

WD domain, G-beta repeat

0.063 

19.8 

509 

WD40 

WD domain, G-beta repeat 

0.063 

19.8 

510

WD40

WD domain, G-beta repeat

2.1e-42 

154.3 

511

pkinase

Eukaryotic protein kinase domain

2.3e-86 

300.4 

512

G-gamma

GGL domain

1.9e-08 

34.3 



jUj UUIlialll 

3e-06 

34.2 

DID 

HTH_AraC

Bacterial regulatory helix-turn-helix
protei

3.9e-27 

103.6 

516

zf-C2H2

Zinc finger, C2H2 type

1.7e-34

128.0 

517

S1

S1 RNA binding domain

6.1e-58 

205.9 

518

pkinase

Eukaryotic protein kinase domain

1.8e-75 

264.2 

JZ.J 

cadherin

Cadherin domain

2e-80 

280.6 



Zinc finger, C2H2 type

4e-70 

246.4 


neur_chan

Neurotransmitter-gated ion-channel 

5.8e-222 

750.8 


RhoGEF

RhoGEF domain 

3.5e-44 

160.2 


myosin_head

Myosin head (motor domain)

0 

1494.5 


LRR

Leucine Rich Repeat

8.3e-15

62.6 

535 

Sec7 

Sec7 domain 

5.1e-92 

319.1 

jjO 

homeobox

Homeobox domain

4.8e-05 

26.4 

539 

actin 

Actin 

2.4e-100 

330.6 

542

ank

Ank repeat

1.9e-35 

131.2 

544

zf-CCCH

Zinc finger C-x8-C-x5-C-x3-H type

2.8e-10 

41.7 

546

DSPc

Dual specificity phosphatase,
catalytic doma

2.4e-40 

147.4 

547 

HMG_CoA_synt 

Hydroxymethylglutaryl-coenzyme A 
synthas 

0 

1250.8 

549 

laminin_G

Laminin G domain 

3.3e-76 

266.6 

551 

PHD 

PHD-finger

0.008 

9.3 

552 

PDZ 

PDZ domain (Also known as DHR or
GLGF).

0.0017

25.0

555 

WW 

WW domain 

1.3e-24

95.3 

558 

kinesin 

Kinesin motor domain 

1.8e-176 

599.7 

559 

zf-C3HC4 

Zinc finger, C3HC4 type (RING
finger)

0.00085 

16.5 

563 

efhand 

EF hand 

7.9e-11

49.4 

567 

PH 

PH domain 

7.8e-06 

25.9 

568 

PH 

PH domain 

3.1e-39 

143.8 

569 

Hist_deacetyl

Histone deacetylase family 

5.2e-106 

365.6 

570 

PDZ 

PDZ domain (Also known as DHR or 
GLGF). 

3.4e-20 

80.5 

571 

zf-C3HC4 

Zinc finger, C3HC4 type (RING 
finger) 

1e-16

58.5 

573 

ubiquitin 

Ubiquitin family 

1.4e-08 

31.1 

574 

FH2 

Formin Homology 2 Domain

1.3e-110

380.9 

576 

serpin 

Serpins (serine protease inhibitors) 

4.3e-146 

496.4 

579 

zf-C2H2 

Zinc finger, C2H2 type 

5.7e-76 

265.8 

580 

pkinase 

Eukaryotic protein kinase domain 

6.9e-79 

275.5 

581 

RhoGAP 

RhoGAP domain

4.4e-53 

189.8 

582 

Ribosomal_L7Ae

Ribosomal protein L7Ae

0.028 

1.0 

584 

kazal 

Kazal-type serine protease inhibitor 
domain 

2.2e-52 

187.4 

585 

LRR 

Leucine Rich Repeat 

4.4e-28 

306.7 

586 

PHD 

PHD-finger

3.8e-12 

53.8 

588 

GTP1_OBG

GTP1/OBG family

1.1e-62

215.2 

590 

Collagen 

Collagen triple helix repeat (20
copies)

8e-42 

152.4 

591 

lys

C-type lysozyme/alpha-lactalbumin
family

1.6e-31 

116.4 

596 

ACBP 

Acyl CoA binding protein

0.0022 

-9.4 

597 

SNF2_N

SNF2 and others N-terminal domain

3.7e-98 

339.5 

600 

KRAB 

KRAB box 

1.3e-29 

111.8 

606 

LRR 

Leucine Rich Repeat 

1e-05

32.5 

607 

LRR 

Leucine Rich Repeat 

1e-05

32.5 

608 

WD40 

WD domain, G-beta repeat 

5.3e-23 

89.8 

610 

cpn60_TCP1

TCP-1/cpn60 chaperonin family

1.7e-237 

802.4 

613 

THF_DHG_CYH

Tetrahydrofolate 
dehydrogenase/cyclohydro 

4.9e-173 

588.3 

617 

rrm 

RNA recognition motif. 

4e-14 

60.4 

618 

rrm

RNA recognition motif. 

4e-14 

60.4 

620 

cofilin_ADF

Cofilin/tropomyosin-type actin-
binding pr 

3e-06 

34.2 

621 

Nop 

Putative snoRNA binding domain 

6.1e-95 

328.8 

622 

UCH-2 

Ubiquitin carboxyl-terminal 
hydrolase family 

5.8e-21 

83.1 

625 

zf-C2H2 

Zinc finger, C2H2 type 

2.5e-124 

426.4 

628 

DEAD 

DEAD/DEAH box helicase 

2.5e-68 

219.0 

632 

GST 

Glutathione S-transferases. 

4.8e-26 

89.0 

633 

5_nucleotidase

5-nucleotidase 

6.6e-248 

837.0 

636 

LIM 

LIM domain containing proteins 

1.6e-88

307.5 

637 

pkinase 

Eukaryotic protein kinase domain 

1.5e-73 

257.8 

638 

MSP_domain

MSP (Major sperm protein) domain 

8.4e-09 

42.7 

639 

metalthio

Metallothionein 

2e-24 

94.6 

641 

zf-C2H2 

Zinc finger, C2H2 type 

6.1e-114 

391.9 

642 

Ribosomal_S28e

Ribosomal protein S28e 

9.3e-48 

172.1 

643 

Ribosomal_S5

Ribosomal protein S5 

8.3e-87 

301.8 

646 

PHD 

PHD-finger 

0.00025 

23.1 

647 

WD40

WD domain, G-beta repeat

1.5e-22 

88.4 

648 

Lipase_GDSL

Lipase/Acylhydrolase with GDSL-
like motif

0.015

2.2

652 

zf-C2H2 

Zinc finger, C2H2 type 

4.1e-146

498.8 

653 

histone 

Core histone H2A/H2B/H3/H4 

1.2e-10

48.8 

654 

zf-C2H2 

Zinc finger, C2H2 type 

1.9e-87 

303.9 

655 

ras 

Ras family 

6.4e-77 

269.0 

657 

zf-C3HC4 

Zinc finger, C3HC4 type (RING
finger)

5.3e-13

46.4

658 

STphosphatase 

Ser/Thr protein phosphatase 

2.6e-182

619.1 

659 

zf-C2H2 

Zinc finger, C2H2 type 

1.3e-92

321.1 

660 

zf-C2H2 

Zinc finger, C2H2 type 

1.5e-85

297.6 

662 

NDK 

Nucleoside diphosphate kinases 

1.4e-119 

410.7 

664 

IRF 

Interferon regulatory factor
transcription f

7e-20

79.5

665 

4HPPD_C 

4-hydroxyphenylpyruvate
dioxygenase C term

1.4e-16

68.5

666 

DEAD 

DEAD/DEAH box helicase 

4.8e-74 

237.1 

667 

DEAD 

DEAD/DEAH box helicase 

2.9e-70 

225.1 

669 

pkinase 

Eukaryotic protein kinase domain 

6.1e-93 

322.2 

671 

homeobox 

Homeobox domain 

0.018 

16.5 

678 

crystall 

Beta/Gamma crystallin 

4.7e-106 

365.8 

679 

WD40 

WD domain, G-beta repeat 

1.9e-06

34.9 

680 

Keratin_B2

Keratin, high sulfur B2 protein 

4.1e-06 

15.9 

682 

G-gamma 

GGL domain 

8.5e-33 

117.9 

685 

UCH-2 

Ubiquitin carboxyl-terminal
hydrolase family

1.4e-29

111.7

686 

Acetyltransf 

Acetyltransferase (GNAT) family 

6.6e-10 

46.4 

687 

7tm_1

7 transmembrane receptor (rhodopsin
family)

4.6e-15

50.0

688 

proteasome 

Proteasome A-type and B-type 

6.5e-64 

225.7 

689 

SCP2 

SCP-2 sterol transfer family 

6.2e-37 

136.1 

690 

TS-N 

TS-N domain 

0.041 

20.1 

692 

zf-C2H2 

Zinc finger, C2H2 type 

9.9e-60 

211.9 

693 

zf-MYND 

MYND finger 

0.038

5.5 

694 

Oxysterol_BP

Oxysterol-binding protein 

3.9e-133 

455.7 

695 

PDZ 

PDZ domain (Also known as DHR or
GLGF).

1.3e-30

115.1

703 

Peptidase_C2 

Calpain family cysteine protease 

2.3e-175 

596.0 

706 

filament 

Intermediate filament proteins 

7.2e-107 

368.5 

710 

fibrinogen_C

Fibrinogen beta and gamma chains,
C-term

7e-80

278.0

711 

SH2 

Src homology domain 2 

2.3e-65 

192.1 

712 

ATP-synt_DE

ATP synthase, Delta/Epsilon chain 

0.00062 

19.0 

713 

ARID 

ARID DNA binding domain 

2e-17 

71.3

714 

LBP_BPI_CETP

LBP / BPI / CETP family

8.6e-34 

125.7 

715 

RNA_pol_L 

RNA polymerases L / 13 to 16 kDa
subunit

4.8e-49

176.3

716 

KRAB 

KRAB box 

1.3e-42 

155.0 

717 

mito_carr

Mitochondrial carrier proteins 

4.8e-38 

133.3 

719 

Gal-bind_lectin

Vertebrate galactoside-binding lectin 

1.5e-25 

90.2 

726 

aldedh 

Aldehyde dehydrogenase family 

1.3e-119 

410.8 

728 

Glycos_transf_2

Glycosyl transferases 

4e-21 

83.6 

734 

ELM2 

ELM2 domain 

2e-34 

127.8 

735 

PR55 

Protein phosphatase 2A regulatory
subunit PR

0

1038.2

737 

DSPc 

Dual specificity phosphatase,
catalytic doma

4e-14

60.4

740 

WD40 

WD domain, G-beta repeat 

5.6e-14 

59.9 

745 

zf-C3HC4 

Zinc finger, C3HC4 type (RING
finger)

3.8e-13

46.9

749 

mito_carr

Mitochondrial carrier proteins 

4.5e-67 

232.8 

750 

DUF27 

Domain of unknown function DUF27 

4.5e-12 

53.5 

751 

SH3 

SH3 domain 

3.6e-17 

70.5 

752 

HMG_box

HMG (high mobility group) box 

8.6e-13 

55.9 

753 

SPRY 

SPRY domain 

5.9e-05 

23.3 

754 

GTP_CDC 

Cell division protein 

7.5e-153 

521.2 

755 

mito_carr

Mitochondrial carrier proteins 

3e-88 

305.4 

756 

TSPN 

Thrombospondin N-terminal -like 
domains 

8.1e-58 

205.5 

757 

BTB 

BTB/POZ domain 

5.7e-23 

89.7 

759 

zf-C2H2

Zinc finger, C2H2 type 

1.2e-12 

55.4 

760 

NSF 

NSF attachment protein 

6.4e-127 

435.1 

762 

Ribosomal_S14 

Ribosomal protein S14p/S29e 

2.1e-06 

24.8 

765 

ThiF_family

ThiF family 

1.7e-39 

144.6 

766 

DnaJ 

DnaJ domain 

3.9e-36 

133.5 

768 

tRNA-synt_2b 

tRNA synthetase class II 

9.1e-81 

281.7 

769 

ldl_recept_a 

Low-density lipoprotein receptor 
domain 

0 

1404.5 

770 

WD40 

WD domain, G-beta repeat 

2e-21 

84.6 

771 

LRR 

Leucine Rich Repeat 

3.8e-06 

33.9 

774 

SNF2 N 

SNF2 and others N-terminal domain 

5.5e-99 

342.3 

776 

VPS9 

Vacuolar sorting protein 9 (VPS9) 
domain 

1.1e-30

115.4 

777

VPS9 

Vacuolar sorting protein 9 (VPS9) 
domain 

1.1e-30

115.4 

778 

VPS9 

Vacuolar sorting protein 9 (VPS9) 
domain 

1.1e-30

115.4 

779 

zf-C3HC4 

Zinc finger, C3HC4 type (RING 
finger) 

3.1e-08

31.0 

781 

cadherin 

Cadherin domain 

5.6e-113 

388.7 

783 

HECT 

HECT-domain (ubiquitin- 
transferase). 

4.2e-31 

116.8 

785 

sushi 

Sushi domain (SCR repeat) 

1.8e-60 

214.3 

786 

sushi 

Sushi domain (SCR repeat) 

1.8e-60 

214.3 

788 

vwa 

von Willebrand factor type A domain 

1.9e-52 

187.7 

790 

rrm 

RNA recognition motif. 

2.8e-20 

80.8 

791 

Collagen 

Collagen triple helix repeat (20 
copies) 

0.00097 

9.7 

792 

pkinase 

Eukaryotic protein kinase domain 

0.023 

12.4 

795 

zf-C2H2 

Zinc finger, C2H2 type 

6.5e-95 

328.7 

796 

adh_short

short chain dehydrogenase 

4.1e-05 

-7.3 

799 

SAICAR_synt

SAICAR synthetase 

6e-125 

428.5 

805 

WD40 

WD domain, G-beta repeat 

4e-65 

229.8 

806 

ZU5 

ZU5 domain 

4.7e-37 

136.5 

807 

WD40 

WD domain, G-beta repeat 

0.016 

21.8 

808 

WD40 

WD domain, G-beta repeat 

0.0041 

23.8 

809 

pkinase 

Eukaryotic protein kinase domain 

2e-31 

117.2 

810 

vwa 

von Willebrand factor type A domain 

1.9e-52 

187.7 

814 

zf-C2H2 

Zinc finger, C2H2 type 

4.5e-83 

289.4 

815 

zf-C2H2 

Zinc finger, C2H2 type 

6e-74 

259.1 

817 

myosin_head

Myosin head (motor domain) 

1.5e-176 

599.9 

818 

GSPII_E 

Bacterial type II secretion system 
protein 

0.012 

11.5 

819 

PDEase 

3'5'-cyclic nucleotide 
phosphodiesterase 

1.1e-74

215.5 

821 

PH 

PH domain 

0.00025 

20.5 

822 

CNH 

CNH domain 

0.00015 

-24.7 

827 

rrm 

RNA recognition motif.

1.5e-06

35.2 
829

HMG_box 

HMG (high mobility group) box 

7.8e-34 

125.8 

830 

RasGEF 

RasGEF domain 

2.2e-102

353.5 

831 

CNH 

CNH domain 

3e-118

406.2 

832 

mito_carr

Mitochondrial carrier proteins 

3.7e-37 

130.3 

833 

PX 

PX domain 

2.7e-19 

77.5 

837 

Y_phosphatase 

Protein-tyrosine phosphatase 

1.6e-263 

888.8 

838 

ank 

Ank repeat 

2.4e-270 

911.5 

840 

ank 

Ank repeat 

5.8e-38 

139.6 

842 

Ribosomal_L15e

Ribosomal L15

4.8e-131 

448.8 

843 

SNF 

Sodium:neurotransmitter symporter
family 

0 

1201.8 

845 

Peptidase_M16

Insulinase (Peptidase family M16)

4.7e-67

236.2 

848 

EF1BD 

EF-1 guanine nucleotide exchange 
domain 

2.2e-56 

200.7 

849 

zf-C2H2 

Zinc finger, C2H2 type 

1.5e-122 

420.5 

850 

zf-C2H2 

Zinc finger, C2H2 type 

2e-67 

237.4 

852 

SIS 

SIS domain 

3.8e-30 

113.6 

853 

RhoGAP 

RhoGAP domain 

1.1e-37

138.6 

854 

PDZ 

PDZ domain (Also known as DHR or 
GLGF). 

5.1e-10 

46.7 

856 

ACOX 

Acyl-CoA oxidase 

9.1e-263

886.3 

858 

efhand 

EF hand 

2.4e-18 

74.4 

860 

homeobox 

Homeobox domain 

4e-22 

86.9 

862 

TFIIF_beta 

Transcription initiation factor IIF,
beta

2.2e-134

459.8 

866 

A2M 

Alpha-2-macroglobulin family 

4.9e-21 

70.9 

867 

MoCF_biosynth

Molybdenum cofactor biosynthesis
protei

5.8e-205 

694.3 

868 

EGF 

EGF-like domain 

4.1e-22 

86.9 

869 

EGF 

EGF-like domain 

1.1e-22

88.8 

871 

PI-PLC-X 

Phosphatidylinositol-specific 
phospholipase

7.2e-95 

328.6 

872 

UCH-2 

Ubiquitin carboxyl-terminal 
hydrolase family 

1.1e-20

82.1 

874 

SH3 

SH3 domain 

2.2e-14 

61.2 

877 

SH3 

SH3 domain 

8.6e-90 

311.7 

882 

KRAB 

KRAB box 

6.9e-45 

162.6 

885 

ank 

Ank repeat 

7.1e-07 

36.3 

886 

biopterin_H 

Biopterin-dependent aromatic amino 
acid h

0 

988.3 

887 

GTP_EFTU

Elongation factor Tu family 

4.9e-129 

437.5 

888 

zf-C3HC4 

Zinc finger, C3HC4 type (RING 
finger)

1.6e-14 

51.4 

889 

zf-C2H2 

Zinc finger, C2H2 type 

3.7e-92 

319.6 

890 


Immunoglobulin domain 

3.8e-06 

24.8 

892 

PTR2 

POT family 

9.5e-48 

163.0 

893 

Sulfatase 

Sulfatase 

3.5e-78 

273.2 

894 

Sulfatase 

Sulfatase 

3.5e-78 

273.2 

895 

7tm_1

7 transmembrane receptor (rhodopsin 
family) 

4.5e-51 

164.4 

896 

Glyco_hydro_31

Glycosyl hydrolases family 31 

0 

1277.3 

897 

chromo 

'chromo' (CHRromatin Organization
Modifier) 

3.9e-06 

26.0 

898 

Cbl_N 

CBL proto-oncogene N-terminal 
domain 

1.2e-273 

922.4 

899 

vwa 

von Willebrand factor type A domain 

5.5e-32 

119.7 

900 

WD40 

WD domain, G-beta repeat 

2.7e-07 

37.7 

901 

zf-C2H2 

Zinc finger, C2H2 type 

4e-156 

532.1 

903 

ras 

Ras family 

6.6e-101 

348.6 

904 

Armadillo_seg

Armadillo/beta-catenin-like repeats

1.1e-06

35.6 

906 

FH2 

Formin Homology 2 Domain

4.5e-112 

385.7 

907 

Cytidylyltransf 

Cytidylyltransferase 

1.4e-05 

29.3 

908 

pkinase 

Eukaryotic protein kinase domain 

1.2e-64 

228.2 

909 

pkinase 

Eukaryotic protein kinase domain 

8.5e-70 

245.3 

910 

pkinase 

Eukaryotic protein kinase domain 

2.9e-42 

153.8 

911 

pkinase 

Eukaryotic protein kinase domain 

1.2e-35 

131.8 

912 

PHD 

PHD-finger 

5.1e-06 

33.4 

913 

PHD 

PHD-finger 

5.5e-16 

66.5 

916 

filament 

Intermediate filament proteins 

9.7e-121 

414.5 

917 

LIM 

LIM domain containing proteins 

5.9e-15

57.9 

918 

SAM 

SAM domain (Sterile alpha motif)

4.3e-16 

66.9 

922 

Acylphosphatase 

Acylphosphatase 

2.9e-63 

223.6 

924 

ig 

Immunoglobulin domain 

1.3e-08

32.8 

925 

Acyl-CoA_dh

Acyl-CoA dehydrogenase 

2.4e-131 

449.8 

927 

7tm_l 

7 transmembrane receptor (rhodopsin 
family) 

2.9e-45 

145.9 

928 

globin 

Globin 

2.4e-52 

186.9 

929 

sugar_tr

Sugar (and other) transporter

1.2e-16

68.8 

932 

Collagen 

Collagen triple helix repeat (20 
copies) 

0.00097 

9.7 

933 

HMG_box

HMG (high mobility group) box 

7.8e-34 

125.8 

934 

SEA 

SEA domain 

0.0021 

24.7 

935 

ras 

Ras family 

6.4e-59 

209.2 

936 

CH 

Calponin homology (CH) domain 

3.8e-21 

83.7 

937 

voltage_CLC

Voltage gated chloride channels 

1.9e-199 

676.0 

938 

homeobox 

Homeobox domain 

1.9e-25 

98.0 

940 

pkinase 

Eukaryotic protein kinase domain 

9.9e-58 

205.2 

942 

Myosin_tail

Myosin tail 

3.7e-09 

38.2 

943 

zf-C2H2 

Zinc finger, C2H2 type 

2.2e-92 

320.3 

945 

Clat_adaptor_s 

Clathrin adaptor complex small chain 

1.3e-76

268.0 

946 

sugar_tr

Sugar (and other) transporter 

0.017 

-122.8 

947 

tRNA-synt_1e

tRNA synthetases class I (C) 

0.00097 

15.6 

948 

PHD 

PHD-finger 

2.2e-17

71.2 

951 

sugar_tr 

Sugar (and other) transporter 

0.0082 

-113.9 

952 

mito_carr 

Mitochondrial carrier proteins 

1.7e-54 

189.7 

953 

myb_DNA-binding

Myb-like DNA-binding domain 

4.5e-20 

80.1 

955 

ketoacyl-synt 

Beta-ketoacyl synthase 

7.1e-133 

454.8 

957 

aldo_ket_red

Aldo/keto reductase family 

1.5e-98 

340.8 

959 

Kelch 

Kelch motif 

0.02 

20.8 

961 

ras 

Ras family 

2.2e-29 

111.1 

964 

homeobox 

Homeobox domain 

5.4e-22 

86.5 

965 

PH 

PH domain 

3e-21 

80.9 

966 

zf-C3HC4 

Zinc finger, C3HC4 type (RING 
finger) 

2.2e-09 

34.7 

967 

Ribosomal_L29 

Ribosomal L29 protein 

1.6e-15 

65.0 

970 

FAD_binding_2 

FAD binding domain 

8.9e-47 

166.6 

971 

rve 

Integrase core domain 

0.00015 

19.8 

972 

Glycos_transf_2

Glycosyl transferases 

2.1e-21 

84.5 

974 

Ribosomal L10 

Ribosomal protein L10 

3.3e-48 

173.6 

975 

7tm_1

7 transmembrane receptor (rhodopsin 
family) 

1.6e-37 

121.3 

976 

zf-C4 

Zinc finger, C4 type (two domains) 

2.1e-52 

178.5 

977 

zf-C2H2 

Zinc finger, C2H2 type 

6.6e-150 

511.4 

978 

FTHFS 

Formate-tetrahydrofolate ligase

0 

1367.2 

982 

Renal_dipeptase 

Renal dipeptidase 

1.3e-73 

258.0 

984 

A_deaminase

Adenosine/ AMP deaminase 

2.6e-05

-48.6 
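The rows of the preceding table appear here one cell per line, with cells separated by blank lines and long descriptions wrapped over two lines. As a minimal sketch, assuming each row consists of five cells (SEQ ID NO, Pfam model name, description, E-value and score; the column headings fall outside this excerpt, so these field names are assumptions), the cells can be regrouped into records as follows. The names `DomainHit`, `parse_cells` and `group_hits` are hypothetical.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    """One table row; the field names are assumed, not taken from the source."""
    seq_id: int
    model: str
    description: str
    e_value: float
    score: float


def parse_cells(text: str) -> List[str]:
    """Split flattened table text into cells.

    A cell is a run of non-blank lines; wrapped lines are joined with a space.
    """
    cells: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            cells.append(" ".join(current))
            current = []
    if current:
        cells.append(" ".join(current))
    return cells


def group_hits(cells: List[str]) -> List[DomainHit]:
    """Group consecutive cells five at a time, ignoring an incomplete last row."""
    hits: List[DomainHit] = []
    usable = len(cells) - len(cells) % 5
    for i in range(0, usable, 5):
        seq_id, model, desc, e_value, score = cells[i:i + 5]
        hits.append(DomainHit(int(seq_id), model, desc, float(e_value), float(score)))
    return hits


# Two rows copied from the table above, in the flattened layout.
sample = """927

7tm_1

7 transmembrane receptor (rhodopsin
family)

2.9e-45

145.9

928

globin

Globin

2.4e-52

186.9
"""

hits = group_hits(parse_cells(sample))
for hit in hits:
    print(hit.seq_id, hit.model, hit.e_value, hit.score)
```

Grouping by blank-line-separated blocks rather than by physical lines keeps a wrapped description such as "7 transmembrane receptor (rhodopsin family)" in a single field.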
TABLE 5


SEQ ID NO:
of full-length
nucleotide
sequence

SEQ ID NO:
of full-length
peptide
sequence

SEQ ID NO:
of contig
nucleotide
sequence

SEQ ID NO:
of contig
peptide
sequence

Priority docket
number_corresponding
SEQ ID NO: in
priority application

SEQ ID NO: in

1

985 

1969 

2953 

787CIP2_1

150 

2 

986 

1970 

2954 

787CIP2_2

223 

3 

987 

1971 

2955 

787CIP2_3 

1884 

4 

988 

1972 

2956 

787CIP2_4 

2123 

5 

989 

1973 

2957 

787CIP2_5 

2313 

6 

990 

1974 

2958 

787CIP2_6 

3284 

7 

991 

1975 

2959 

787CIP2_7

3324 

8 

992 

1976 

2960 

787CIP2 8 

6182 

9 

993 

1977 

2961 

787CIP2_9 

6210 

10 

994 

1978 

2962 

787CIP2_10

6213 

11 

995 

1979 

2963 

787CIP2 11 

6257 

12 

996 

1980 

2964 

787CIP2_12

6294 

13 

997 

1981 

2965 

787CIP2_13

6294 

14 

998 

1982 

2966 

787CIP2_14

6330 

15 

999 

1983 

2967 

787CIP2_15

6364 

16 

1000 

1984 

2968 

787CIP2 16 

6455 

17 

1001 

1985 

2969 

787CIP2_17

6486 

18 

1002 

1986 

2970 

787CIP2_18

6503 

19 

1003 

1987 

2971 

787CIP2_19

6528 

20 

1004 

1988 

2972 

787CIP2_20 

6572 

21 

1005 

1989 

2973 

787CIP2_21

6578 

22 

1006 

1990 

2974 

787CIP2_22 

6593 

23 

1007 

1991 

2975 

787CIP2 23 

6603 

24 

1008 

1992 

2976 

787CIP2_24

6603 

25 

1009 

1993 

2977 

787CIP2_25

6679 

26 

1010 

1994 

2978 

787CIP2 26 

6744 

27 

1011 

1995 

2979 

787CIP2_27

6762 

28 

1012 

1996 

2980 

787CIP2_28 

6770 

29 

1013 

1997 

2981 

787CIP2_29

6770 

30 

1014 

1998 

2982 

787CIP2_30 

6787 

31 

1015 

1999 

2983 

787CIP2_31

6858 

32 

1016 

2000 

2984 

787CIP2_32 

6866 

33 

1017 

2001 

2985 

787CIP2_33

6938 

34 

1018 

2002 

2986 

787CIP2_34

6938 

35 

1019 

2003 

2987 

787CIP2_35 

6977 

36 

1020 

2004 

2988 

787CIP2_36 

7001 

37 

1021 

2005 

2989 

787CIP2 37 

7002 

38 

1022 

2006 

2990 

787CIP2 38 

7004 

39 

1023 

2007 

2991 

787CIP2_39 

7005 

40 

1024 

2008 

2992 

787CIP2_40 

7006 

41 

1025 

2009 

2993 

787CIP2_41 

7008 

42 

1026 

2010 

2994 

787CIP2_42

7014 

43 

1027 

2011 

2995 

787CIP2_43 

7021 

44 

1028 

2012 

2996 

787CIP2 44 

7022 

45 

1029

2013 

2997 

787CIP2 46 

7057 

46 

1030 

2014 

2998 

787CIP2_47 

7058 

47 

1031 

2015 

2999 

787CIP2_49 

7088 

48 

1032 

2016 

3000 

787CIP2_50 

7089 

49 

1033 

2017 

3001 

787CIP2_51 

7182 

50 

1034 

2018 

3002 

787CIP2_52 

7489 

51 

1035 

2019 

3003 

787CIP2_53

7564 

52 

1036 

2020 

3004 

787CIP2_54 

7566 

53 

1037 

2021 

3005 

787CIP2_55 

7587 




WO 01/57190 


PCT/US01/04098 


54

1038

2022

3006

787CIP2_56

[illegible]

55

1039

2023

3007

787CIP2_57

[illegible]

56

1040

2024

3008

787CIP2_58

[illegible]

57

1041

2025

3009

787CIP2_59

[illegible]

58

1042

2026

3010

787CIP2_60

[illegible]

59

1043

2027

3011

787CIP2_61

[illegible]

60

1044

2028

3012

787CIP2_62

[illegible]

61

1045

2029

3013

787CIP2_63

[illegible]

62

1046

2030

3014

787CIP2_64

[illegible]

63

1047

2031

3015

787CIP2_65

[illegible]

64

1048

2032

3016

787CIP2_66

[illegible]

65

1049

2033

3017

787CIP2_67

[illegible]

66

1050

2034

3018

787CIP2_68

[illegible]

67

1051

2035

3019

787CIP2_69

[illegible]

68

1052

2036

3020

787CIP2_70

[illegible]

69

1053

2037

3021

787CIP2_71

[illegible]

70

1054

2038

3022

787CIP2_72

[illegible]

71

1055

2039

3023

787CIP2_73

[illegible]

72

1056

2040

3024

787CIP2_74

[illegible]

73

1057

2041

3025

787CIP2_75

[illegible]

74

1058

2042

3026

787CIP2_76

[illegible]

75

1059

2043

3027

787CIP2_77

[illegible]

76

1060

2044

3028

787CIP2_78

[illegible]

77

1061

2045

3029

787CIP2_79

[illegible]

78

1062

2046

3030

787CIP2_80

[illegible]

79

1063

2047

3031

787CIP2_81

[illegible]

80

1064

2048

3032

787CIP2_82

[illegible]

81

1065

2049

3033

787CIP2_83

7516

82

1066

2050

3034

787CIP2_84

[illegible]

83

1067

2051

3035

787CIP2_85

7842

84

1068

2052

3036

787CIP2_86

[illegible]

85

1069

2053

3037

787CIP2_87

[illegible]

86

1070

2054

3038

787CIP2_88

[illegible]

87

1071

2055

3039

787CIP2_89

[illegible]

88

1072

2056

3040

787CIP2_90

[illegible]

89

1073

2057

3041

787CIP2_91

[illegible]

90

1074

2058

3042

787CIP2_92

[illegible]

91

1075

2059

3043

787CIP2_93

[illegible]

92

1076

2060

3044

787CIP2_94

[illegible]

93

1077

2061

3045

787CIP2_95

[illegible]

94

1078

2062

3046

787CIP2_96

[illegible]

95

1079

2063

3047

787CIP2_97

[illegible]

96

1080

2064

3048

787CIP2_98

[illegible]

97

1081

2065

3049

787CIP2_99

[illegible]

98

1082

2066

3050

787CIP2_100

[illegible]

99

1083

2067

3051

787CIP2_101

[illegible]

100

1084

2068

3052

787CIP2_102

[illegible]

101

1085

2069

3053

787CIP2_103

[illegible]

102

1086

2070

3054

787CIP2_104

[illegible]

103

1087

2071

3055

787CIP2_105

[illegible]

104

1088

2072

3056

787CIP2_106

[illegible]

105

1089

2073

3057

787CIP2_107

[illegible]

106

1090

2074

3058

787CIP2_108

[illegible]

107

1091

2075

3059

787CIP2_109

[illegible]

108

1092

2076

3060

787CIP2_110

7977

109 

1093 

2077 

3061 

787CIP2_111

7978 

110

1094 

2078 

3062 

787CIP2_112

7980 

111 

1095 

2079 

3063 

787CIP2_113

7982 

112 

1096 

2080 

3064 

787CIP2_114

8000 

113 

1097 

2081 

3065 

787CIP2_115

8003 
114

1098

2082

3066

787CIP2_116

[illegible]

115

1099

2083

3067

787CIP2_117

[illegible]

116

1100

2084

3068

787CIP2_118

[illegible]

117

1101

2085

3069

787CIP2_119

[illegible]

118

1102

2086

3070

787CIP2_120

[illegible]

119

1103

2087

3071

787CIP2_121

[illegible]

120

1104

2088

3072

787CIP2_122

[illegible]

121

1105

2089

3073

787CIP2_123

[illegible]

122

1106

2090

3074

787CIP2_124

[illegible]

123

1107

2091

3075

787CIP2_125

[illegible]

124

1108

2092

3076

787CIP2_126

[illegible]

125

1109

2093

3077

787CIP2_127

[illegible]

126

1110

2094

3078

787CIP2_128

[illegible]

127

1111

2095

3079

787CIP2_129

[illegible]

128

1112

2096

3080

787CIP2_130

[illegible]

129

1113

2097

3081

787CIP2_131

[illegible]

130

1114

2098

3082

787CIP2_132

[illegible]

131

1115

2099

3083

787CIP2_133

[illegible]

132

1116

2100

3084

787CIP2_134

[illegible]

133

1117

2101

3085

787CIP2_135

[illegible]

134

1118

2102

3086

787CIP2_136

[illegible]

135

1119

2103

3087

787CIP2_137

[illegible]

136

1120

2104

3088

787CIP2_138

[illegible]

137

1121

2105

3089

787CIP2_139

8059

138

1122

2106

3090

787CIP2_140

[illegible]

139 

1123 

2107 

3091 
6217

415 

1399 

2383 

3367 

787CIP2B 64 

6220 

416 

1400 

2384 

3368 

787CIP2B 65 

6221 

417 

1401 

2385 

3369 

787CIP2B_66 

6222 

418 

1402 

2386 

3370 

787CIP2B 67 

6223 

419 

1403 

2387 

3371 

787CIP2B_68 

6223 

420 

1404 

2388 

3372 

787CIP2B_69 

6226 

421 

1405 

2389 

3373 

787CIP2B_70 

6227 

422 

1406 

2390 

3374 

787CIP2B 71 

6229 

423 

1407 

2391 

3375 

787CIP2B 72 

6248 

424 

1408 

2392 

3376 

787CIP2B_73

6260 

425 

1409 

2393 

3377 

787CIP2B 74 

6264 

426 

1410 

2394 

3378 

787CIP2B 75 

6269 

427 

1411 

2395 

3379 

787CIP2B_76 

6269 

428 

1412 

2396 

3380 

787CIP2B 77 

6275 

429 

1413 

2397 

3381 

787CIP2B 78 

6276 

430 

1414 

2398 

3382 

787CIP2B 79 

6280 

431 

1415 

2399 

3383 

787CIP2B_80

6287 

432 

1416 

2400 

3384 

787CIP2B 81 

6290 

433 

1417 

2401 

3385 

787CIP2B 82 

6293 

434 

1418 

2402 

3386 

787CIP2B_83 

6305 

435

1419 

2403 

3387 

787CIP2B 84 

6308 

436 

1420 

2404 

3388 

787CIP2B 85 

6309 

437 

1421 

2405 

3389 

787CIP2B 86 

6312 

438 

1422 

2406 

3390 

787CIP2B_87

6314 

439 

1423 

2407 

3391 

787CIP2B_88 

6316 

440 

1424 

2408 

3392 

787CIP2B 89 

6336 

441 

1425 

2409 

3393 

787CIP2B 90 

6341 

442 

1426 

2410 

3394 

787CIP2B_91 

6343 

443 

1427 

2411 

3395 

787CIP2B 92 

6346 

444 

1428 

2412 

3396 

787CIP2B_93 

6357 

445 

1429 

2413 

3397 

787CIP2B 94 

6359 

446 

1430 

2414 

3398 

787CIP2B 95 

6367 

447 

1431 

2415 

3399 

787CIP2B 96 

6383 

448 

1432 

2416 

3400 

787CIP2B 97 

6385 

449 

1433 

2417 

3401 

787CIP2B_98 

6396 

450 

1434 

2418 

3402 

787CIP2B_99

6396 

451 

1435 

2419 

3403 

787CIP2B_100

6403 

452 

1436 

2420 

3404 

787CIP2B 101 

6405 

453 

1437 

2421 

3405 

787CIP2B 102 

6414 

454 

1438 

2422 

3406 

787CIP2B_103

6418 

455 

1439 

2423 

3407 

787CIP2B 104 

6422 

456 

1440 

2424 

3408 

787CIP2B 105 

6425 

457 

1441 

2425 

3409 

787CIP2B 106 

6436 

458 

1442 

2426

3410 

787CIP2B 107 

6471 

459 

1443 

2427 

3411 

787CIP2B 108 

6474 

460 

1444 

2428 

3412 

787CIP2B 109 

6482 

461 

1445 

2429 

3413 

787CIP2B 110 

6504 

462 

1446 

2430 

3414 

787CIP2B 111 

6510 

463 

1447 

2431 

3415 

787CIP2B_112

6515 

464 

1448 

2432 

3416 

787CIP2B_113

6529 

465 

1449 

2433 

3417 

787CIP2B 114 

6535 

466 

1450 

2434 

3418 

787CIP2B 115 

6536 

467 

1451 

2435 

3419 

787CIP2B 116 

6536 

468 

1452 

2436 

3420 

787CIP2B_117 

6541 

469 

1453 

2437 

3421

787CIP2B_118

6542 

470 

1454 

2438 

3422 

787CIP2B_119 

6547 

471 

1455 

2439 

3423 

787CIP2B_120

6548 

472 

1456 

2440 

3424 

787CIP2B_121

6552 

473 

1457 

2441 

3425 

787CIP2B_122

6552 
474

1458 

2442 

3426 

787CIP2B_123

6555 

475 

1459 

2443 

3427 

787CIP2B_124

6560 

476 

1460 

2444 

3428 

787CIP2B_125

6566 

477 

1461 

2445 

3429 

787CIP2B 126 

6576 

478 

1462 

2446 

3430 

787CIP2B 127 

6584 

479 

1463 

2447 

3431 

787CIP2B 128 

6588 

480 

1464 

2448 

3432 

787CIP2B 129 

6589 

481 

1465 

2449 

3433 

787CIP2B 130 

6590 

482 

1466 

2450 

3434 

787CIP2B 131 

6597 

483 

1467 

2451 

3435 

787CIP2B_132

6600 

484 

1468 

2452 

3436 

787CIP2B 133 

6602 

485 

1469 

2453 

3437 

787CIP2B 134 

6604 

486 

1470 

2454 

3438 

787CIP2B 135 

6605 

487 

1471 

2455 

3439 

787CIP2B 136 

6608 

488 

1472 

2456 

3440 

787CIP2B_137 

6610 

489 

1473 

2457 

3441 

787CIP2B 138 

6614 

490 

1474 

2458 

3442 

787CIP2B 139 

6623 

491 

1475 

2459 

3443 

787CIP2B 140 

6629 

492 

1476 

2460 

3444 

787CIP2B 141 

6631 

493 

1477 

2461 

3445 

787CIP2B 142 

6631 

494 

1478 

2462 

3446 

787CIP2B 143 

6631 

495 

1479 

2463 

3447 

787CIP2B_144

6632 

496 

1480 

2464 

3448 

787CIP2B_145

6633 

497 

1481 

2465 

3449 

787CIP2B 146 

6634 

498 

1482 

2466 

3450 

787CIP2B 147 

6635 

499 

1483 

2467 

3451 

787CIP2B 148 

6639 

500 

1484 

2468 

3452 

787CIP2B 149 

6649 

501 

1485 

2469 

3453 

787CIP2B_150

6651 

502 

1486 

2470 

3454 

787CIP2B 151 

6655 

503 

1487 

2471 

3455 

787CIP2B 152 

6658 

504 

1488 

2472 

3456 

787CIP2B_153

6667 

505 

1489 

2473 

3457 

787CIP2B 154 

6672 

506 

1490 

2474 

3458 

787CIP2B_155

6682 

507 

1491 

2475 

3459 

787CIP2B 156 

6683 

508 

1492 

2476 

3460 

787CIP2B 157 

6687 

509 

1493 

2477 

3461 

787CIP2B 158 

6687 

510 

1494 

2478 

3462 

787CIP2B 159 

6688 

511 

1495 

2479 

3463 

787CIP2B 160 

6696 

512 

1496 

2480 

3464 

787CIP2B 161 

6701 

513 

1497 

2481 

3465 

787CIP2B 162 

6707 

514 

1498 

2482 

3466 

787CIP2B 163 

6712 

515 

1499 

2483 

3467 

787CIP2B_164

6714 

516 

1500 

2484 

3468 

787CIP2B 165 

6720 

517 

1501 

2485 

3469 

787CIP2B_166

6721 

518 

1502 

2486 

3470 

787CIP2B_167

6722 

519 

1503 

2487 

3471 

787CIP2B 168 

6736 

520 

1504 

2488 

3472 

787CIP2B_169

6740 

521 

1505 

2489 

3473 

787CIP2B_170

6740 

522 

1506 

2490 

3474 

787CIP2B_171

6760 

523 

1507 

2491 

3475 

787CIP2B_172

6775 

524 

1508 

2492 

3476 

787CIP2B 173 

6784 

525 

1509 

2493 

3477 

787CIP2B_174

6793 

526 

1510 

2494 

3478 

787CIP2B 175 

6795 

527 

1511 

2495 

3479 

787CIP2B_176

6796 

528 

1512 

2496 

3480 

787CIP2B 177 

6807 

529 

1513 

2497 

3481 

787CIP2B_178

6808

530 

1514 

2498 

3482 

787CIP2B_179

6810 

531 

1515 

2499 

3483 

787CIP2B 180 

6815 

532 

1516 

2500 

3484 

787CIP2B 181 

6819 

533 

1517 

2501 

3485 

787CIP2B_182

6821 
534

1518 

2502 

3486 

787CIP2B 183 

6827 

535 

1519 

2503 

3487 

787CIP2B_184

6829 

536 

1520 

2504 

3488 

787CIP2B 185 

6830 

537 

1521 

2505 

3489 

787CIP2B 186 

6835 

538 

1522 

2506 

3490 

787CIP2B 187 

6848 

539 

1523 

2507 

3491 

787CIP2B 188 

6849 

540 

1524 

2508 

3492 

787CIP2B 189 

6851 

541 

1525 

2509 

3493 

787CIP2B 190 

6851 

542 

1526 

2510 

3494 

787CIP2B_191

6863 

543 

1527 

2511 

3495 

787CIP2B 192 

6869 

544 

1528 

2512 

3496 

787CIP2B_193

6874 

545 

1529 

2513 

3497 

787CIP2B_194

6887 

546 

1530 

2514 

3498 

787CIP2B_195

6890 

547 

1531 

2515 

3499 

787CIP2B 196 

6894 

548 

1532 

2516 

3500 

787CIP2B 197 

6899 

549 

1533 

2517 

3501 

787CIP2B_198 

6900 

550 

1534 

2518 

3502 

787CIP2B_199 

6903 

551 

1535 

2519 

3503 

787CIP2B_200 

6910 

552 

1536 

2520 

3504 

787CIP2B_201

6913 

553 

1537 

2521 

3505 

787CIP2B 202 

6918 

554 

1538 

2522 

3506 

787CIP2B_203 

6923 

555 

1539 

2523 

3507 

787CIP2B 204 

6926 

556 

1540 

2524 

3508 

787CIP2B 205 

6929 

557 

1541 

2525 

3509 

787CIP2B 206 

6929 

558 

1542 

2526 

3510 

787CIP2B 207 

6932 

559 

1543 

2527 

3511 

787CIP2B 208 

6941 

560 

1544 

2528 

3512 

787CIP2B 209 

6951 

561 

1545 

2529 

3513 

787CIP2B 210 

6954 

562 

1546 

2530 

3514 

787CIP2B_211 

6954 

563 

1547 

2531 

3515 

787CIP2B 212 

6956 

564 

1548 

2532 

3516 

787CIP2B 213 

6957 

565 

1549 

2533 

3517 

787CIP2B 214 

6960 

566 

1550 

2534 

3518 

787CIP2B 215 

6966 

567 

1551 

2535 

3519 

787CIP2B 216 

6968 

568 

1552 

2536 

3520 

787CIP2B 217 

6969 

569 

1553 

2537 

3521 

787CIP2B 218 

6970 

570 

1554 

2538 

3522 

787CIP2B_219 

6971 

571 

1555 

2539 

3523 

787CIP2B 220 

6989 

572 

1556 

2540 

3524 

787CIP2B 221 

6990 

573 

1557 

2541 

3525 

787CIP2B_223 

6996 

574 

1558 

2542 

3526 

787CIP2B 224 

6997 

575 

1559 

2543 

3527 

787CIP2B 225 

7009 

576 

1560 

2544 

3528 

787CIP2B_226 

7016 

577 

1561 

2545 

3529 

787CIP2B_227 

7023 

578 

1562 

2546 

3530 

787CIP2B_228 

7023 

579 

1563 

2547 

3531 

787CIP2B 229 

7035 

580 

1564 

2548 

3532 

787CIP2B_230

7038 

581 

1565 

2549 

3533 

787CIP2B_231

7039 

582 

1566 

2550 

3534 

787CIP2B_232 

7040 

583 

1567 

2551 

3535 

787CIP2B_233

7041 

584 

1568 

2552 

3536 

787CIP2B_234

7044 

585 

1569 

2553 

3537 

787CIP2B_235

7059 

586 

1570 

2554 

3538 

787CIP2B_236

7060 

587 

1571 

2555 

3539 

787CIP2B_237

7063 

588 

1572 

2556 

3540 

787CIP2B_238 

7067 

589

1573 

2557 

3541 

787CIP2B_239

7070 

590 

1574 

2558 

3542 

787CIP2B_240

7071 

591 

1575 

2559 

3543 

787CIP2B_241

7079 

592 

1576 

2560 

3544 

787CIP2B_242 

7085 

593 

1577 

2561 

3545 

787CIP2B_243

7148 
594

1578 

2562 

3546

787CIP2B_244

7156

595 

1579 

2563 

3547 

787CIP2B 245 

7156 

596 

1580 

2564 

3548 

787CIP2B 246 

7171 

597 

1581 

2565 

3549 

787CIP2B_248

7265 

598 

1582 

2566 

3550 

787CIP2B_249

7268 

599 

1583 

2567 

3551 

787CIP2B_250

7308 

600 

1584 

2568 

3552 

787CIP2B_251

7336 

601 

1585 

2569 

3553 

787CIP2B_252

7347 

602 

1586 

2570 

3554 

787CIP2B_253 

7405 

603 

1587 

2571 

3555 

787CIP2B 254 

7405 

604 

1588 

2572 

3556 

787CIP2B 255 

7412 

605 

1589 

2573 

3557 

787CIP2B 256 

7412 

606 

1590 

2574 

3558 

787CIP2B_257

7436 

607 

1591 

2575 

3559 

787CIP2B_258

7436 

608 

1592 

2576 

3560 

787CIP2B_259 

7454 

609 

1593 

2577 

3561 

787CIP2B_260 

7476 

610 

1594 

2578 

3562 

787CIP2B_261 

7598 

611 

1595 

2579 

3563

787CIP2B_262

7619 

612 

1596 

2580 

3564 

787CIP2B 263 

7644 

613 

1597 

2581 

3565 

787CIP2B 264 

7648 

614 

1598 

2582 

3566 

787CIP2B 265 

7659 

615 

1599 

2583 

3567 

787CIP2B_266

7661 

616 

1600 

2584 

3568 

787CIP2B_267 

7669 

617 

1601 

2585 

3569 

787CIP2B 268 

7686 

618

1602 

2586 

3570 

787CIP2B 269 

7686

619

1603

2587 

3571 

787CIP2B 270 

7694

620

1604

2588 

3572 

787CIP2B_271

7697

621

1605

2589

3573

787CIP2B 272 

7733

622

1606

2590 

3574 

787CIP2B 273 

7734 

623 

1607 

2591 

3575 

787CIP2B_274 

7744 

624 

1608 

2592 

3576 

787CIP2B 275 

7751 

625 

1609 

2593 

3577 

787CIP2B 276 

7756 

626 

1610 

2594 

3578 

787CIP2B 277 

7761 

627 

1611

2595 

3579 

787CIP2B 278 

7761

628

1612

2596 

3580 

787CIP2B 279 

7776

629

1613

2597 

3581

787CIP2B 280 

7783 

630 

1614 

2598 

3582 

787CIP2B_281 

7800 

631 

1615 

2599 

3583 

787CIP2B_282

7800 

632 

1616 

2600 

3584 

787CIP2B_283 

7801 

633 

1617 

2601 

3585 

787CIP2B 284 

7811 

634 

1618 

2602 

3586 

787CIP2B 285 

7817 

635 

1619 

2603 

3587 

787CIP2B 286 

7821 

636 

1620 

2604 

3588 

787CIP2B_287 

7822 

637 

1621 

2605 

3589 

787CIP2B_288 

7841 

638 

1622 

2606 

3590 

787CIP2B_289 

7847 

639 

1623 

2607 

3591 

787CIP2B 290 

7880 

640 

1624 

2608 

3592 

787CIP2B 291 

7910 

641 

1625 

2609 

3593 

787CIP2B 293 

7936 

642 

1626 

2610 

3594 

787CIP2B 294 

7945 

643 

1627 

2611 

3595 

787CIP2B 295 

7948 

644 

1628 

2612 

3596 

787CIP2B 296 

7963 

645 

1629 

2613 

3597 

787CIP2B 297 

7984 

646 

1630 

2614 

3598 

787CIP2B_298 

7985 

647 

1631 

2615 

3599 

787CIP2B_299 

8014 

648 

1632 

2616 

3600 

787CIP2B 301 

8029 

649 

1633 

2617 

3601 

787CIP2B 302 

8043 

650 

1634 

2618 

3602 

787CIP2B_303 

8164 

651 

1635 

2619 

3603 

787CIP2B 304 

8175 

652 

1636 

2620 

3604 

787CIP2B 305 

8250 

653 

1637 

2621 

3605 

787CIP2B 306 

8253 
654

1638 

2622 

3606 

787CIP2B 307 

8255 

655 

1639 

2623 

3607 

787CIP2B 308 

8258 

656 

1640 

2624 

3608 

787CIP2B_309

8270 

657 

1641 

2625 

3609 

787CIP2B 310 

8271 

658 

1642 

2626 

3610 

787CIP2B_311

8272 

659 

1643 

2627 

3611 

787CIP2B 312 

8279 

660 

1644 

2628 

3612 

787CIP2B 313 

8284 

661 

1645 

2629 

3613 

787CIP2B 314 

8285 

662 

1646 

2630 

3614 

787CIP2B_315 

8304 

663 

1647 

2631 

3615 

787CIP2B 316 

8309 

664 

1648 

2632 

3616 

787CIP2B 317 

8320 

665 

1649 

2633 

3617 

787CIP2B 318 

8331 

666 

1650 

2634 

3618 

787CIP2B 319 

8332 

667 

1651 

2635 

3619 

787CIP2B_320 

8332 

668 

1652 

2636 

3620 

787CIP2B_321 

8335 

669 

1653 

2637 

3621 

787CIP2B_322

8337 

670 

1654 

2638 

3622 

787CIP2B 323 

8353 

671 

1655 

2639 

3623 

787CIP2B 324 

8355 

672 

1656 

2640 

3624 

787CIP2B_325 

8358 

673 

1657 

2641 

3625 

787CIP2B_326 

8361 

674 

1658 

2642 

3626 

787CIP2B 327 

8369 

675 

1659 

2643 

3627 

787CIP2B_328

8385 

676 

1660 

2644 

3628 

787CIP2B_329

8397 

677 

1661 

2645 

3629 

787CIP2B 330 

8414 

678 

1662 

2646 

3630 

787CIP2B_331 

8431 

679 

1663 

2647 

3631 

787CIP2B_332 

8433 

680 

1664 

2648 

3632 

787CIP2B_333

8444 

681 

1665 

2649 

3633 

787CIP2B_334

8446 

682 

1666 

2650 

3634 

787CIP2B 335 

8460 

683 

1667 

2651 

3635 

787CIP2B_336 

8478 

684 

1668 

2652 

3636 

787CIP2B_337

8490 

685 

1669 

2653 

3637 

787CIP2B 338 

8505 

686 

1670 

2654 

3638 

787CIP2B_339 

8523 

687 

1671 

2655 

3639 

787CIP2B_340 

8530 

688 

1672 

2656 

3640 

787CIP2B 341 

8533 

689 

1673 

2657 

3641 

787CIP2B 342 

8534 

690 

1674 

2658 

3642 

787CIP2B_343 

8536 

691 

1675 

2659 
1948

2932 

3916 

787CIP2D 34 

10244 

965 

1949 

2933 

3917 

787CIP2D_35 

10278 

966 

1950 

2934 

3918 

787CIP2E_1

4251 

967 

1951 

2935 

3919 

787CIP2E 2 

5310 

968 

1952 

2936 

3920 

787CIP2E 3 

5697 

969 

1953 

2937 

3921 

787CIP2E_4 

5731 

970 

1954 

2938 

3922 

787CIP2E_5 

5733 

971 

1955 

2939 

3923 

787CIP2E_6

5734 

972 

1956 

2940 

3924 

787CIP2E 7 

5740 

973 

1957 

2941 

3925 

787CIP2E 8 

7657

974 

1958 

2942 

3926 

787CIP2E 9 

9572 

975 

1959 

2943 

3927 

787CIP2F 1 

1363 

976 

1960 

2944 

3928 

787CIP2F_2 

4303 

977 

1961 

2945 

3929 

787CIP2F 3 

5760 

978 

1962 

2946 

3930 

787CIP2F 4 

5766 

979 

1963 

2947 

3931 

787CIP2F 5 

5767 

980 

1964 

2948 

3932 

787CIP2F_6

5767 

981 

1965 

2949 

3933 

787CIP2F 7 

5770 

982 

1966 

2950 

3934 

787CIP2F_8

6855 

983 

1967 

2951 

3935 

787CIP2F 9 

10026 

984 

1968 

2952 

3936 

787CIP2F 10 

10227 
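In the legible rows of Table 5, the four SEQ ID NO columns of a row differ by a constant step of 984 (for example 1, 985, 1969, 2953 and 984, 1968, 2952, 3936). This regularity is an observation about the table as reproduced here, not a rule stated in the text, but under that assumption it gives a simple arithmetic check for transcription errors. The sketch below is illustrative only; `OFFSET`, `expected_ids` and `inconsistent_rows` are hypothetical names.

```python
from typing import List, Tuple

# Step between adjacent SEQ ID NO columns, inferred from the legible rows of
# Table 5 (an assumption, not something the text states explicitly).
OFFSET = 984

Row = Tuple[int, int, int, int]


def expected_ids(full_length_nt: int) -> Tuple[int, ...]:
    """Return the four SEQ ID NOs expected for a row under the assumed step.

    Order: full-length nucleotide, full-length peptide,
    contig nucleotide, contig peptide.
    """
    return tuple(full_length_nt + k * OFFSET for k in range(4))


def inconsistent_rows(rows: List[Row]) -> List[Row]:
    """Return the rows whose SEQ ID NOs do not follow the assumed pattern."""
    return [row for row in rows if tuple(row) != expected_ids(row[0])]


rows = [
    (1, 985, 1969, 2953),       # first row of Table 5
    (45, 3029, 2013, 2997),     # hypothetical misread second value
    (984, 1968, 2952, 3936),    # last row of Table 5
]

flagged = inconsistent_rows(rows)
print(flagged)
print(expected_ids(45))
```

A row flagged this way is only a candidate for re-checking against the original filing; the check cannot say which cell of the row is wrong.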


TABLE 6 


SEQ ID
NO:

Method

Predicted
beginning
nucleotide
location
corresponding
to first amino
acid residue of
peptide
sequence

Predicted end
nucleotide
location
corresponding
to last amino
acid residue of
peptide
sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

2953 

A 

3 

324 

ISEHRIEASGNYLAQRLTSSFLRGLSSWKSNPLML 
CGWTILLTLTMVQGEP*GP\KGIPG\FHTNSSYPH 
WGTVAKPPAGD*DLLPAPGQEGTPLFTR*SLCTY
CPID 

2954 

A 

18 

467 

REELGKDLFDCTLYVLLKYDDFNADKHLALEEF

YRAFQVIQLSLPEDQKLSITAATVGQSAVLSCAIQ 

GTLRPPnWKRNNIILNNLDLEDINDFGDDGSLYIT 

KVTTTHVGNYTCYADGYEQVYQTHIFQVNVPPV 

IRVYPESQARRAG 

2955 

A 

3 

23 

FYSAFLVADKGIVTSKHNNDTQH1WESDSNEFSV 
IADPRGNTLGRGTTIT*VSIPPSL 

2956 

A 

1 

493 

RTKTDVYILNLAVADLLLLFTLPFWAVNAVHGW 

VLGKIMCKITSALYTLNFVSGMQFLACISIDRYV 

AVTKVPSQSGVGKPCWDCFCVWMAAILLSIPQL 

VFYTVNDNARCIPIFPRYLGTSMKALIQMLEICIG 

FVVPFLIMGVCYFITARTLMKMPNIKIS 

2957 

A 

703 

302 

EETGVREKRRERMKEKMWQNVLCCTLQTAVIL 
KLFQNKVLNILKNFFLSPLDTRKNKVFKKWAGG 
PGAVAHACNPSTLGGRGGRITKSGDRDHPGQHG 
ETRSLPACWAQWKSLALPVSRAPGRQGSLVVFP 
LP 

2958 

A 

575 

1054 

CTKCKADCDTCFNKNFCTKCKSGFYLHLGKCLD 

NCPEGLEANNHTMECVSIVHCEVSEWNPWSPCT 

KKGKTCGFKRGTETRVREIIQHPSAKGNLCPPTN 

ETRKCTVQRKXCQKGERGKXGRER^ 

ESKEAIPDSKSLESSKE1PEQRENKQQQ 

2959 

A 

1 

426 

LSMLSTISTEHRLSVLWPIWYCCHCPTHLSAVMC 
VLLWALSLLQSILEWMFCSFLFSDVDSDNWCQIL 
DFLTAVWL1FLIVLVLCGFTLVLLVRJICGSQKMPL 
TRLYVTILLTGLVFLFCSLPLSIQ*FLLYWIEKDLD 
DL 

2960 

A 

1194 

852 

EKRKTSYSQCLNSKQRNVSMRPSIWIHVHLKPPC 
RLVELLPFSSALQGLSIHULSLGTTLP/V*GHLRFRL 
RNLPQSLRTVILPERNEEQNLQELSHNADKYQM 
GDCCKEEIDDSIFY 

2961 

A 

274 

2250 

EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLN 

SLTPPTSVRJIMPL1TTVTLLKMVARHHMKLLCSK 

AFSTQLQQKIFLHSQMGIHHQSVCMKLKPNTSHI1 

SILMGQPMALVQLETLAPLTII1QKFQTQDHMKF 

WKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITK 

TIQNGRELFESSLCGDLLNEVQASE\Q*NQSIESRK 

EKRKKSNKHDSSRSEERKSPIKIPKLEPEEQNRPN 

ERVDTVSEKJPREEPVLKEGSPSSANTIFCSNNGSV

HWKFQVGDLVWSKVGTYPWWPCMVSSDPQL 

EVHTK1NTRGAREYHVQFFSNQPERAWVHEKRV 

REYKGHKQYEELLAEATKQASNHSEKQKIRKPR 

PQRERAQWDIGIAHAEKALKMTREERIEQYTFIYI 

DKQPEEALSQAKKSVASKTEVKKTRRPRSVLNT 

QPEQTNAGEVASSLSSTEIRRHSQRitHTSAEEEEP 

PPVKIAWKTAAARKSLPAS1TMHKGSLDLQKCN

MSPVVKIEQVFALQNATGDGKFIDQFVYSTKGIG 

NKTEISVRGQDRLIISTPNQRNEKPTQSVSSPEATS 

GSTGSVEKKQQRRSIRTRSESEKSTEWPKKKIK 

KEQVETVPQATVKTGLQKGSADRGVQGSVRFSD 

SSVSAAIEETVD 

2962 

A

2408 

836 

SASPPPPPPPPPSRFPFSGAPGARDRSGPLGSEPQR 

NPGARPRTLEATVTPPGSVGAMSSSGLNSEKVA 

ALIQKLNSDPQFVLAQNVGTTHDLLDICLKRATV 

QRAQHVFQHAVPQEGKPITNQKSSGRCWIFSCLN 

VMRLPFMKKLNIEEFEFSQSYLFFWDKVERCYFF 

LSAFVDTAQRKEPEDGRLVQFLLMNPANDGGQ 

WDMLVNIVEKYGVIPKKCFPESYTTEATRRMND 

ILNHKMREFCIRLRNLVHSGATKGEISATQDVM 

MEEIFRVVCICLGNPPETFTWEYRDKDKNNKKIG 

PVITPLEFNR/EQHVKPLFNMEDKICLVNDPRPQH 

KYNKLYTV\EYL\SNMVWRGEKLFYNNQPIDFLK 

KMVAASIKDGVEAVWFGCDVGKHF\NSKLG\LSD 

MNLYDHELVFGVSLKNMNKAER\LTFGES\LMT 

HTMTFTAV/SQSRDDSGMVLFTKWXRVGEFQWG 

EDHGH\KGYLCMTD*VGSLEYVYEVV/VWDRKH 

VPVEEVLAVLGAGNPFVLPAWDPMGALAE 

2963 

A 

90 

543 

RHYDSAGKITLK1AKNYLEQRAVGGASPRLAQS 

VLTCSREPILENSLTSLIEYLHNALEHDMRLRFNN 

DRMKTTIKETST*LSNSYLVFPLM*SLTYLMKMS 
FERCTARNKMFVNSPFTKVDNYCTASS\WKKFYL 
KCYFSLNTIKKEKKMT 

2964 

A 

3 

2454 

FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQRIS 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSKLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKKQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASLIPNDQLLPR 

KLNTEPKDVP/IACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAIPTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVT\VF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

AG WbDooQ V b>brbKUJNri 1 rNoOLioUyUJJoIvorvi i 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQP1VFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKLAVWPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHA1LQLFQGDQIWLRL 

HRGAIYGSSW 

2965 

A 

3 

2454 

FDTYRGLPSISNGNYSQLQFQAREYSGAPYSQR1S 

AITTVSVAWKVLSGKIGEGAEGNCKCVISEGAW 

AVCPTQPCGKAKPDKHLKDLLSKLLNSGYFESIP 

VPKNAKEKEVPLEEEMLIQSEKKTQLSKTESVKE 

SESLMEFAQPEIQPQEFLNRRYMTEVDYSNKQGE 

EQPWEADYARKPNLPKRWDMLTEPDGQEKKQE 

SFKSWEASGKHQEVSKPAVSLEQRKQDTSICLRS 

TLPEEQKKQEISKSKPSPSQWKQDTPKSKAGYVQ 

EEHKKQETPKLWPVQLQKEQDPKXQTPKSWTPS 

MQSEQNTTKSWTTPMCEEQDSKQPETPKSWENN 

VESQKHSLTSQSQISPKSWGVATASL1PNDQLLPR 

KLNTEPKDWAACASA*GFLPLQPPFRRI/HVLRK 

EKLQDLMTQIQGTCNFMQESVLDFDKPSSAffTS 

QPPSATPG*PRRHLKEQNLS\VKVIFFQGAVT\VF 

NVNAPLPPRKEQEIKESPYSPGYNQSFTTASTQTP 

PQCQLPSIHVEQTVHSQETANYHPDGTIQVSNGS 

LAFYPAQTNVFPRPTQPFVNSRGSVRGCTRGGRL 

ITNSYRSPGGYKGFDTYRGLPSISNGNYSQLQFQ 

AREYSGAPYSQRDNFQQCYKRGGTSGGPRANSR 

PVDVPVTNPAATILPVHVYPLPQQMRVAFSAAR 

TSNLAPGTLDQPIVFDLLLNNLGETFDLQLGRFN 

CPVNGTYVFIFHMLKLAVNVPLYVNLMKNEEVL 

VSAYANDGAPDHETASNHAILQLFQGDQIWLRL 

HRGAIYGSSW 

2966 

A 

1693 

227 

DYVLTAELHRQRSPGVSFGLSVFNLMNAIMGSGI 
LGLAYVMANTGVFGFSFLLLTVALLASYSVHLL 
LSMCIQTAYLGP*TNYFMVLPAH*LTCLPLDEFLQ
SL*NSL\*AVTSYEDLGLFAFGLPGKLVVAGTiriQ 

NIGAMSSYLL1IKTELPAAIAEFLTGDYSRYWYLD 

GQTLLIIICVGrVFPLALLPKlGFLGYTSSLSFFFM 

MFFALVVIIKKWSIPCPLTLNYVEKGFQISNVTDD 

CKPKLFHFSKESAYALPTMAFSFLCHTSILPIYCE 

LQSPSKKRMQNVTNTAIALSFLIYFISALFGYLTF

YD/GTTKAQRGE VTCHRIKDK VESELLKG* * *IP* 

SHDVVVMT\VKLCILFAVLL\TVPLIHFPARKAVT 

MMFFSNFPFSW1RHFLITLALNIIIVLLAIYVPDIRN 

VFGVVGASTSTCLIFIFPGLFYLKLSREDFLSWKK 

LGVGCFC/LLSFKTSILRNSLSVYIILPASRKSIYFK 

I 

2967 

A 

3 

3222 

SGIVVRALWREKKPGGGRRVKRRNPGRQAVGH 

TEEDPPRVGTPWKEHTGPGPQEGSTMEAAHAKT 

TEECLAYFGVSETTGLTPDQVKRNLEKYGLNELP 

AEEGKTLWELVEEQFEDLLVRILLLAACISFVLA 

WFEEGEETITAFVEPFVILLILIANAIVGVWQERN 

AENAJEALKEYEPEMGKVYRADRKSVQR1KARD 

IVPGDIVEVAVGDKVPADIRILAIKSTTLRVDQSIL 

TGEYVSVIKHTEPVPDPRAVNQDKKNMLFSGTNI 

AAGKALGIVATTGVGTEIGKIRDQMAATEQDKT 

PLQQKLDEFGEQLSKVISLICVAVWLINIGHFNDP 

VHGGSWFRGAIYYFKIAVALAVAAIPEGLPAVIT 

TCLALGTRRMAKKNAIVRSLPSVETLGCTSVICS 

DKTGTLTTNQMSVCKMFIIDKVDGDICLLNEFSIT 

GSTYAPEGEVLKNDKPVRPGQYDGLVELATICA 

LCNDSSLDFNEAKGVYEKVGEATETALTTLVEK 

MNVFNTDVRSLSKVERANACNSVIRQLMKKEFT 

LEFSRDRKSMSVYCSPAKSSRAAVGNKMFVKGA 

PEGVIDRCNYVRVGTTRVPLTGPVKEKIMAVIKE

WGTGRDTLRCLALATRDTPPKREEMVLDDSARJF 

LEYETDLTFVGVVGMLDPPRKEVTGSIQLCRDA 

GIRVIMITGDNKGTAIAICRRIGIFGENEEVADRA 

Y\TGREFDDL\PLAEQ\REACRRACCFARVEPSHK 

SKIVEYLQSYDEITAMTGDGVNDAPALKKAEIGI 

AMGSGTAVAKTASEMVLADDNFSTIVAAVEEGR 

AIYNNMKQFIRYLISSNVGEWCIFLTAALGLPEA 

LIPVQLLWVNLVTDGLPATALGFNPPDLDIMDRP 

PRSPKEPLRSGWLFFRYMAIGGYVGAATVGAAA 

WWFLYAEDGPHVNYSQLTHFMQCTEDNTHFEGI 

DCEVFEAPEPMTMALSVLVTIEMCNALNSLSEN 

QSLLRMPPWVNJWLLGSICLSMSLHFLILYVDPLP 

MIFKLRALDLTQWLMVLKISLPVIGLDEILKFVA 

RNYLEG*LFPLLHL*ARVTDPEDERRK 

2968 

A 

3 

2414 

GARSCSRLGRCTFPLWKGREMEVRKLSISWQFLI 

VLVLILQILSALDFDPYRVLGVSRTASQADIKKA 

YKKLAREWHPDKNKDPGAEDKFIQISKAYEILSN 

EEKRSNYDQYGDAGENQGYQKQQQQREYRFRH 

FHENFYFDESFFHFPFNSERRDSIDEKYLLHFSHY 

VNEVAPDSFKKJ>YLIKITSDWCFSCimEPVWKEV 

IQELEELGVGIGWHAGYERRLAHHLGAHSTPSI 

LGIINGK1SFFHNAVVRENLRQFVESLLPGNLVEK 

VTh«Q^YVRFLSGWQQENKPHVLLFDQTPIVPLL 

YKLTAFAYKDYLSFGYVYVGLRGTEEMTRRYNI 

NIYAPTLLVFKEHINRPADVIQARGMKKQIIDDFI 

TRNKYLLAARLTSQKLFHELCPVKRSHRQRKYC 

VVLLTAETTKLSKPFEAFLSFALANTQDTVRFVH 

VYSNRQQEFADTLLPDSEAFQGKSAVSILERRNT 

AGRVVYKTLEDPWIGSESDKFILLGYLDQLRKDP 

ALLSSEAVLPDLTDELAPVFLLRWFYSASDY1SD 

CWDS1FHNNW\REMMPLLSLIFSALFILFGTVIVQ 

AFSDSNDERESSPPEKEEAQEKTGKTEPSFTKENS 

SKIPKKGFVEVTELTDVTYTSNLVRLRPGHMNV 

VLILSNSTKTSLLQKFALEVYTFTGSSCLHFSFLSL

DKHREWLEYLLEFAQDAAPIPNQYDKHFMERDY 

TGYVLALNGHKKYFCLFKPQKTVEEGGKP*GSC 

SDVDSSLYLGESRGKPSCGLGSRPIKGKLSKLSL 

WMERLLEGSLQRFYIPSWPELD 

2969 

A 

48 

1117 

KGLSPDQVLSAFAPLDCEMWLKVFTTFLSFATG 

ACSGLKVTVPSHTVHGVRGQALYLPVHYGFHTP 

ASDIQII WLFERPHTMPKYLLGSVNKSV VPDA'GI 

P/YTSSP*CHPMASLLINPLQFPDEGNYIVKVNIQG 

NGTLSASQKIQVWDDPVTKPVVQIHPPSGAVEY 

VGNMTLTCHVEGGTRI AYOWT KNGRPVHTSST 

YSFSPQNNTLHIAPVTKEDIGNYSCLVRNPVSEM 

ESDIIMPIIYYGPYGLQVNSDKGLKVGEVFTVDL 

GEAILFDCSADSmPNTYSWlRRTDNTTYIIKHGP 

RLEVASEKVAQKTMDYVCCAYNNITGRQDETHF 

TVIITSVGMCDIQGRDPNKT 

2970 

A 

68 

936 

HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYADL 

QFQNSSEMEKIPEIGKFGEKAPPAPSHVWRPAAL 

FLTLLCLLLLIGLGVLASMFHVTLKIEMKKiMNKL 

ONISEELORNISLOLMSNMNnSNKIlWLSTTLOTI 

ATKLCRELYSKEQEHKCKPCPl^WIWHKDSCYF 

LSDDVQTWQESKMACAAQNASLLKINNKNALE 

FIKSQSRSYDYWLGLSPEEDS/YSWYESG*YNQ\P 

SAWI1WAPDLNNMYCGYINRLYVQYYHCTYK 

QRMICEKMANPVQLGSTYFREA 

2971 

A 

912 

2287 

VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAPRF 

LVAFAYWNHYLSCTSPCSCYRPLCRLNFGLNVV 

ENLALLVLTYVSSSEDF/TWVPG*GRSGEVFPEGT 

GLPLPHSDLPTSWCGHSLQCGSQSSFPPAIHENAF 

IVFUSSLGH1V1LLTCILW1^TKKHTVSQE\DGLSL 

AGAPRQPRRKSRTSVLRIRVMVRWELSSNGNPG 

RGVLGLGLGLGNKLRVVGQNLGL*HCVWVVWE 

TGE*KRWRLQMGIE*GVASRRQ*VRNSVRGLVC 

rINSSAPPIvrYMGFFSPTVFGGGVGG*LrWTFILHP 

PEVEAAGIPLLLGPSLPQRQGREHIVV1LAAPACA 

PFHDR*WEPREIRPSP*ELGLRGEPTLSYPASCRVI 

RQPIP*D1^SYSWKQRLFIINFISFFSALAVYFRHN 

MYCEAGVYTTFAILEYTVVLTNMAFHMTAWWD

FGNKELL1TSQPEEKRF 

2972 

A 

1734 

246 

GGII^GRDGRTALPRPREPAERTAGLRKDMRPQE 

LPRLAFPLLLLLLLLLPPPPCPAHSATRFDPTWES 

LDARQLPAWFDQAKFGIFIHWGVFSVPSFGSEWF 

WWYWQBCEKIPKYVEFMKDNYPPSFKYEDFGPL 

FTAKFrTSTANQWADIFQASGAKYlVLTSKHHEGF 

TLWGXSEYSWNAVNAIDEGPKRDIVKELEVAIRNR 

TDLRFGLYYSLFEWFWLFLEDESSSFHKRQFPVS 

KTLPELYELVNNYQPEVLWSDGDGGAPDQYWN 

STGFLAWLYNESPVRGTVVTNDRWGAGSICKHG 

GFYTCSDRYNPGHLLPHKWENCMTIDKLSWGY 

RREAGISDYLTIEELVKQLVETVSCGGNLLMNIG 

PTLDGTISVVFEERLRQMGSWLKVNGEAIYETHT 

WRSQNDTVTPDVWYTSKPKEKLVYA1FLKWPTS 

GQLFLGHPKAILGATEVKLLGHGQPLNWISLEQN 

GIMVELPQLTIHQMPCKWGWALALTNVI 

2973 

A 

24 

1133 

SVPRAGGDMETGAAELYDQALLGILQHVGNVQ 

DFLRVLFGFLYRKTDFYRLLRHPSDRMGFPPGAA 

QALVLQVFKTFDHMARQDDEKRRQELEEKIRRK 

EEEEAKTVSAAAAEKEPVPVPVQEEEIDSTTELDG 

HQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGA 

AEVPRNEPPILPRIQEQFQKNPDSYNGAVRENYTW 

SQDYTDLEVRVPVPKHVVKGKQVSVALSSSSIRV 

AMLEENGERVLMEGKLTHKINTESSLWSLEPGK 

CVLVNLSKVGEYWWNAILEGEEPIDIDKINKERS 

MATVDEEEQAVLDRLTFDYHQKLQGKPQSHEL 

KVHEMLKKGWDAEGSPFRGQRFDPAMFNISPGA 

VQF 

2974 

A 

271 

1854 

MQFGRAHGDCVSGAQLCGCPSMDDYMVLRMIG 

EGSFGRALLVQHESSNQMFAMKE1RLPKSFSNTQ 

NSRKEAVLLAKMKHPNIVAFKESFEAEGHLY1V 

MEYCDGGDLMQKIKQQKGKLFPEDMILNWFTQ 

MCLGVNHIHKKRVLHRDIKSKNIFLTQNGKGKL 

GDFGSARLLSNPMAFACTYVGTPYYVPPEIWEN 

LPYNNKSDIWSLGCILYELCTLKHPFQANSWKNL 

ILKVCQGCISPLPSHYSYELQFLVKQMFKRNPSH 

RPSATTLLSRGIVARJLVQKCLPPEIIMEYGEEVLE 

EIKNSKJTOTPRKKTNPSRJRIALGNEASTVQEEEQ 

DRKGSHTDLESINENLVESALRRVNREEKGNKSV 

HLRKASSPNLHRRQWEKNVPNTALTALENASILT

SSLTAEDDRGGSVIKYSKNTTRKQWLKETPDTLL 

NILKNADLSLAFQTYTIYRPGS\EGFLKGPLSEETE 

ASDSVDGGHDSVILDPERLEPGLDEEDTDFEEED 

DNPDWVSELKKRAGWQGLCDR 

2975 

A 

32 

2833 

PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMERCGSGS 

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG 

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT 

FGQVKFDPPLRKETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG 

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAHKLPDRJLPRKFSVSAKIPETK 
WCQKCCVVRNPYTGHKYLCGALQTSIVLLEWV 
bi M^iVr MLliSJrillJr r lr i^r LlSJvir civil v v rcyc i r 

LVCVGVSRGRDFNQVVRFETVNPNSTSSWFTES 

DTPQTNVTHVTQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVVVLES 

RPTDNPTANSNLYILAGHENSY 

2976 

A 

32 

2833 

PPGEPGAGRGALSPCGPLSGPPPLPGREAGGTCG 

QPVNPVFDLSRRNPQEDFELIQRIGSGTYGDVYK 

ARNVNTGELAAIKVIKLEPGEDFAVVQQEIIMMK 

D\CKHP\DIVAYF\GSYL\RRDKLWI\CMEF\CGSGS

\LQDIYHVTGPLSELQIAYVSRETLQGLYYLHSKG

KMHRDIKGANILLTDNGHVKLADFGVSAQITATI 

AKRKSFIGTPYWMAPEVAAVERKGGYNQLCDL 

WAVGITAIELAELQPPMFDLHPMRALFLMTKSNF 

QPPKLKDKMKWSNSFHHFVKMALTKNPKKRPT 

AEKLLQHPFVTQHLTRSLAIELLDKVNNPDHSTY 

HDFDDDDPEPLVAVPHRIHSTSRNVREEKTRSEIT

FGQVKFDPPLRXETEPHHELPDSDGFLDSSEEIYY 

TARSNLDLQLEYGQGHQG\GYFLGANKSLLKSV 

EEELHQRGHVAHLEDDEGDDDESKHSTLKAKIP 

PPLPPKPKSIFIPQEMHSTEDENQGTIKRCPMSGSP 

\AKPSQVPPRPPPPRLPPHKPVALGNGMSSFQLNG 

ERDGSLCQQQNEHRGENLSRKEKKDVPKPISNG

LPPTPKVHMGACFSKVFNGCPLKIHCASSWINPD 

TRDQYLIFGAEEGIYTLNLNELHETSMEQLFPRR 

CTWLYVMNNCLLSISGKASQLYSHNLPGLFDYA 

RQMQKLPVAIPAHKLPDRJLPRKFSVSAKIPETK 

WCQKCCVVRNPYTGHKYLCGALQTSIVLLEWV 

TTnkAr\TST?\A1 TVUTTYCPTD/^PT VA/fPPA/fT VA/PPDPVP 

EPMQKxMLlivHiUrrlrCri^ vrcyiirr 

LVCVGVSRGRDFNQWRFETVNPNSTSSWFTES 

DTPQTNVTHVTQLERDTILVCLDCCIKIVNLQGR 

LKSSRKLSSELTFDFRIESIVCLQDSVLAFWKHG 

MQGRSFRSNEVTQEISDSTRIFRLLGSDRVVVLES 

RPTDNPTANSNLYILAGHENSY 

2977 

A

174 

1543 

YSLRKGITFKLAGAMVHIKKGELTQEEKELLEVI 
GKGTVQEAGTLLSSKNVRVNCLDENGMTPLMH 
AAYKGKLDMCKLLLRHGADVNCHQHEHGYTA 
LMFAALSGNKDITWVMLEAGAETDVVNSVGRT 
AAQ1VL\AFVGQHDCVT1INNFFPRERLDYYTKPQ 
GLDKEPKLPPKLAGPLHKIITTTNLHPVKIVMLV 
NENPLLTEEAALNKCYRVMDLICEKCMKQRDM 
NEVLAMKMHYISCIFQKCINFLKDGENKLDTLIK 

QQLVRSIAPVEIGSDPTAFSVLTQA1TGQVGFVDV 

EFCTTCGEKGASKRCSVCKMVIYCDQTCQKTHW 

FTHKKICKNLKDIYEKQQLEAAKEKRQEENHGK 

LDVNSNCVNEEQPEAEVGiSQKDSNPEDSGEGK 

KESLESEAELEGLQDAPAGPQVSEE 

2978 

A 

3 

5177 

SDDLRTGLFQDVQDAESLKLPGVYEVLFYNETE 
DCPGMMLWRYPEPRGLTLVRJTPVPFNTTEDPDI 
STADLGDVLQDPCSLEYWDELQKVFVAFREFNL 
SESKVCELQLPDINLVNDQKKLVSSDLWRIVLNS 
SQNGADDQSSASESGSQSTCDPLVTPTALAACTR 

VDSCFTPWFVPSLCVSFQFAHLEFHLCHHLDQLG 

TAAPQYLQPFVSDRNMPSELEYMIVSFREPHMYL 

RQWNNGSVCQEIQFLAQADCKLLECRNVTMQS 

VVKPFSEFGQMAVSSDVVEKLLDCTVIVDSVFVN 

LGQHVVHSLNTAIQAWQQNKCPEVEELVFSHFV 

ICNDTQETLRFGQVDTDENILLASLHSHQYSWRS 

HKSPQLLHICIEGWGNWRWSEPFSVDHAGTFIRT 

IQYRGRTASLIIKVQQLNGVQKQIIICGRQIICSYL 

SQSIELKVVQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLESK 

APEYSrVIQVPSSNSSIIYVWCTVLTLEPNSQVQQ 

RMIVFSPLFIMRSHLPDPI1IHLEKRSLGLSETQIIP 

GKGQEKPLQNIEPDLVHHLTFQAREEYDPSDCA 

VPISTSLIKQIATKVHPGGTVNQILDEFYGPEKSL 

QPIWPYNKKDSDRNEQLSQWDSPMRVKLSIWKP 

YVRTLLIELLPWALLINESKWDLWLFEGEKIVLQ 

VPAGKIIIPPNFQEAFQIGIYWANTNTVHKSVAIK 

LVHNLTSPKWKDGGNGEVVTLDEEAFVDTEIRL 

GAFPGHQKLCQFCISSMVQQGIQIIQIEDKTTIINN 

TPYQIFYKPQLSVCNPHSGKEYFRVPDSATFSICP 

GGEQPAMKSSSLPCWDLMPDISQSVLDASLLQK

QIMLGFSPAPGADSSQCWSLPAIVRPEFPRQSVA 

VPLGNFRENGFCTRAIVLTYQEHLGVTYLTLSED 

PSPRVIIHNRCPVKMLIKENIKDIPKFEVYCKKIPS 

ECS1HHELYHQISSYPDCKTKDLLPSLLLRVEPLD 

EVTTEWSDAIDINSQGTQVVFLTGFGYVYVDVV 

HQCGTVFITVAPEGKAGPILTNTNRAPEKIVTF/K 

MFITQLSLAVFDDLTHHKASAELLRLTLDNIFLC 

VAPGAGPLPGEEPVAALFELYCVEICCGDLQLDN 

QLYNKSNFHFAVLVCQGEKAEPIQCSKMQSLLLS 

NKELEEYKEKCFIKLCITLNEGKSILCDrNEFSFEL 

KPARLYVEDTFVYYIKTLFDTYLPNSRLAGHSTH 

LSGGKQVLPMQVTQHARALVNPVKLRKLVIQPV 

NLLVSIHASLKLYIASDHTPLSFSVFERGPIFTTAR 

QLVHALAMHYAAGALFRAGWVVGSLDILGSPA 

SLVRSIGNGVADFFRLPYEGLTRGPGAFVSGVSR 

GTTSFVKHISKGTLTS1TNLATSLARNMDRLSLDE 

EHYNRQEEWRRQLPESLGEGLRQGLSRLGISLLG 

AIAGIVDQPMQNFQKTSEAQASAGHKAKGVISG 

VGKGIMGVFTKPIGGAAELVSQTGYGILHGAGLS 

QLPKQRHQPSDWHADQAPNSHVKYVWKMLQS 

LGRPEVHMALDVVLVRGSGQEHEGCLLLTSEVL 

FVVSVSEDTQQQAFPVTEIDCAQDSKQNNLLTV 

QLKQPRVACDVEVDGVRERLSEQQYNRLVDYIT 

KTSCHLAPSCSSMQIPCPVVAAEPPPSTVKTYHY 

LVDPHFAQVFLSKFTMVKNKALRKGFP 

2979 

A 

255 

2673 

AWLFPASVLCPRCLTGSAVGSAEWKSLVVLFPFS 

SRPTLGHLDSKPSSKSNMIRGRNSATSADEQPfflG 

NYRLLKTIGKGNFAKVKLARHILTGKEVAVKIDD 

KTQLNSSSLQKLFREVRIMKVLNHPNIVKLFEVIE 

TEKTLYLVMEYASGGEVFDYLVAHGRMKEKEA 

RAKFRQ1VSAVQYCHQKF1VHRDLKAENLLLDA 

DMNIK1ADFGFSNEFTFGNKLDTFCGSPPYAAPEL 

FQGKKYDGPEVDVWSLGVILYTLVSGSLPFDGQ 

NLKELRERVLRGKYRIPFYMSTDCENLLKKFLIL 



WO 01/57190 


PCT/US01/04098



NPSKRGTLEQIMKDRWMNVGHE\DDELKPYGEP 

LP\DYKDPRRTELMVSMGYTREE1QDSLVGQRYN 

EVMATYLLLGYKSSELEGDTITLKPRPSADLTNS 

SAPSPSHKVQRSVSANPKQRRFSDQAGPAIPTSNS 

YSKKTQSNNAENKRPEEDRESGRKASSTAKVPA 

SPLPGLERKKTTPTPSTNSVLSTSTNRSRNSPLL\E 

RASUGQGFHPEWAKTALTMPGSRASTASASAA 

VSAARPRQHQKSMSASVHPNKASGLPPTESNCE 

VPRPRQVCWGSCTAPQRVPVASPSAHNISSSGGA 

PDRTNFPRGVSSRSTFHAGQLRQVR\DQQNLPYG 

VTPASPSGHSQGRRGASGSIFSKFTSKFVRRNLNE 

PESKDR\VETLRPHVV\NSGGNDKEKJBEFREAKPR 

SLRFTWSMKTTSSMEPNEMMREIRKVLDANSCQ 

SELHEKYMLLCMHGTPGHEDFVQWEMEVCKLP 

RLSLNGVRFKRJSGTSMAFKNIASKIANELKL 

2980 

A 

120 

3433 

NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAETNIDQDINNLKEKWESVE 

TKLNER\KTAKLEEALNLA\MEFHNSL\QDFINWLT 

QAEQTLNVASRPSLILDTVLFQIDEHKVFANEVN 

SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW 

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEAV 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSVVHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE 

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL 

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 

2981 

A 

120 

3433 

NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKP 

LGGLPETAKEQLNVHMEVCAAFEAKEETYKSLM 

QKGQQMLARCPKSAETNIDQDINNLKEKWESVE

TKLNER\KT\KLEEALNLA\MEFHNSL\QDFINWLT 

QAEQTLNVASRPSLELDTVLFQIDEHKVFANEVN 

SHREQIIELDKTGTHLKYFSQKQDVVLIKNLLISV 

QSRWEKVVQRLVERGRSLDDARKRAKQFHEAW

SKLMEWLEESEKSLDSELEIANDPDKIKTQLAQH 

KEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADD 

NLKLDDMLSELRDKWDTICGKSVERQNKLEEAX 

LLFSGQFTDALQALIDWLYRVEPQLAEDQPVHG 

DIDLVMNLIDNHKAFQKELGKRTSSVQALKRSA 

RELIEGSRDDSSWVKVQMQELSTRWETVCALSIS 

KQTRLEAALRQAEEFHSWHALLEWLAEAEQTL 

RFHGVLPDDEDALRTLIDQHKEFMKKLEEKRAE 

LNKATTMGDTVLAICHPDSITTIKHWITIIRARFEE

VLAWAKQHQQRLASALAGLIAKQELLEALLAW 

LQWAETTLTDKDKEVIPQEIEEVKALIAEHQTFM 

EEMTRKQPDVDKVTKTYKRRAADPSSLQSHIPV 

LDKGRAGRKRFPASSLYPSGSQTQIETKNPRVNL 

LVSKWQQVWLLALERRRKLNDALDRLEELREF 

ANFDFDIWRKKYMRWMNHKKSRVMDFFRRIDK 

DQDGKITRQEFIDGILSSKFPTSRLEMSAVADIFD 

RDGDGYIDYYEFVAALHPNKDAYKPITDADKIE

DEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 

FGDSQQLRLVRILRSTVMVRVGGGWMALDEFL

VKNDPCRAKGRTNMELREKFILADGASQGMAA 

FRPRGRRSRPSSRGASPNRSTSVSSQAAQAASPQ 

VPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 

ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGE 

DSGLITTAAARVRTQFADSKKTPSRPGSRAGSKA

GSRASSRRGSDASDFDISEIQSVCSDVETVPQTHR 

PTPRAGSRPSTAKPSKIPTPQRKSPASKLDKSSKR 

2982 

A 

1 

2065 

MAAGGAEGGSGPGAAMGDCAEIKSQFRTREGF 

YKLLPGDGAARRSGPASAQTPVPPQPPQPPPGPA 

SASGPGAAGPASSPPPAGPGPGPALPAVRLSLVR 

LGEPDSAGAGEPPATPAGLGSGGDRVCFNLGRE 

LYFYPGCCRRGSQRWHTPLTPFLPPLKSIDLNKPI 

DKRIYKGTQPTCHDFNQFTAATETISLLVGFSAG 

QVQYLDLIKKDTSKLFNEERLIDKTKVTYLKWLP 

ESESLFLASHASGHLYLYNVSHPCASAPPQYSLL 

KQUWGFSFYAAKSKAPRNPLAKWAVGEGPLNE 

FAFSPDGRHLACVSQDGCLRVFHFDSMLLRGLM 

KSYFGGLLCVCWSPDGRYVVTGGEDDLVTVWS 

rTEGRVVARGHGHKSWYNAVAFDPYTTRAEEA 

ATAAGADGERSGEEEEEEPEAAGTGSAGGAPLSP 

LPKAGSITYRFGSAGQDTQFCLWDLTEDVLYPHP 

PL ARTRTLPGTPGTTPP AASS SRGGEPGPGPLPRS 

LSRSNSLPHPAGGGKAGGPGVAAEPGTPFSIGRF 

ATLTLQERRDRGAEKEHKJIYHSLGNISRGGSGG 

SGSGGEKPSGPVPRSRLDPAKVLGTALCPRIHEV 

PLLEPLVCKK1AQERLTVLLFLEDCIITACQEGLIC 

TWARPGKAFTDEETEAQTGEGSWPRSPSKSVVE 

GISSQPGNSPSGTVV 

2983 

A 

3855 

220 

RRFRLSAHRAQPCCRCRGLEMPRGVFQQLSNLV 

LQELNANLSNLTSAFEKATAEKIKCQQEADATN 

RVILLANRLVGGLASENIRWAESVENFRSQGVTL 

CGDVLLISAFVSWGYFTKKYRNELMEKFWIPYI 

HNLKVPIPITNGLDPLSLLTDDADVATWNNQGLP 

SDRMSTENATILGNTERWPLIVDAQLQGIKWIKN 

KYRSELKAIRLGQKSYLDVIEQATSEGDTLLIENI 

GETVDPALDPLLGRNTIKKGKYIKIGDKEVGVPP 

QVPPDPTHQVLQPTLQARDAGSVHVLrNFLVTRD 

GLEDQLLAAVVAKERPDLEQLKANLTKSQNEFK 

IVLKELEDSLLARLSAASGNFLGDTALVENLETT 

KHTASEIEEKVVEAKITEVKINEARENYRPAAER 

ASLLYFILNDLNKrNPVYQFSLKAFNVVFEKAIQR 

TTPANEVKQRVINLTDEITYSVYMYTARGLFERD 

KL1FLAQVTFQVLSMKKELNPVELDFLLRFPFKA 

GVVSPVDFLQHQGWGGIKALSEMDEFKNLDSDI 

EGSAKRWKKLVESEAPEKEIFPKEWKNKTALQK 

LCMVRCLRPDRMTYAJKNFVEEKMGSKFVEGRS 

VEFSKSYEESSPSTSIFFILSPGVDPLKDVEALGKK 

LGFTIDNGKLHNVSLGQGQEVVAENALDVAAEK 

GHWVILQNIHLVARWLGTLDKKLERYSTGRHED 

YRVFIRAEPAPSPETHIIPQGILENAIKITNEPPTGM 

YANLYKALDLFTQDTLEMCTKEMEFKCMLFAL 

CYFHAVVAERRKFGAQGWNRSYPFNNGDLTISI 1 

NVLYNYLEANPKVPWDDLRYLFGEIMYGGHITD 

DWDRRLCRTYLAEYIRTEMLEGDVLLAPGFQIPP 

NLDYKGYHEY1DENLPPESPYLYGLHPNAEIGFL 

TVTSEKLFRTVLEMQPKETDSGAGTGVSREEKV 

KAVLDDILEKIPETFNMAE1MAKAAEKTPYVVV 

AFQECERMNILTNEMRRSLKELNLGLKGELTITT 

DVFDT STAT FYDTVPDTWV A R A VPSMTV/tnT AAW 

YANLLLRIRELEAWTTDFALPTTVWLAGFFNPQS 

FLTAIMQSMARJCNEWPLDKMCLSVEVTKKNRE 

DMTAPPRJEGSYVYGLFMEGARWDTQTGVIAEA 

RLKELTPAMPVIFIKAIPVARMETKNIYECPVYKT 

RIRGPTYVWTFNLKTKEKAAKWILAAVALLLQV 

2984 

A 

2 

1464 

FVLFPGIAMETPGASASSLLLPAASRPPRKREAGE 

AGAATSKQRVLDEEEYIEGLQTVIQRDFFPDVEK 

LQAQKEYLEAEENGDLERMRQIAIKFGSALGKM 

SREPPPPYVTPATFETPEVHAGTGVVGNKPRPRG 

RGLEDGEAGEEEEKEPLPSLDVFLSRYTSEDNAS 

FQEIMEVAKERSRARHAWLYQAEEEFEKRQKDN 

LELPSAEHQAIESSQASVETWKYKAKNSLMYYP 

EGVPDEEQLFKKPRQVVHKNTRFLRDPFSQALSR 

CQLQQAAALNAQHKQGKVGPDGKELIPQESPRV 

GGFGFVATPSPAPGVNFSPMMTWGFVFNTPI RV 

EGSETPYVDRTPGPAFKILEPGRRERLGLKMANE 

AAAKNRAKKQEALRRVTENLASLTPKGLSPAMS 

PALQRLVSRTASKYTDRALRASYTPSPARSTHLK 

NPGPVGCRPPQSTPGA/PGSATRTPL\TQDPA\SIT 

DNLLQLPARRKASDFF 

2985 

A 

1890

178 

ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQXQDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 

2986 

A 

1890 

178 

ASTQEAGLLSPPGVGAQRCWNFVACLPVRACAD 

MASNDYTQQATQSYGAYPTQPGQGYSQQSSQP 

YGQQSYSGYSQSTDTSGYGQSSYSSYGQSQNSY 

GTQSTPQGYGSTGGYGSSQSSQSSYGQQSSYPGY 

GQQPAPSSTSGSYGSSSQSSSYGQPQSGSYSQQPS 

YGGQQQSYGQQQSYNPPRGYGQQNQYNSSSGG 

GGGGGGGGSYGQDQSSMSGSGGGGGGGGGGGS 

GGGGGYGNQDQTGAAGSRGYRQ\QDRGGRCRG 

GSGGGGS\GGAAGYNRSSGGYEPRGRGGGRGGR 

GGMGGSDRGGFNKFGGPRDQGSRHDSEQDNSD 

NNTIFVQGLGENVTIESVADYFKQIGIIKTNKKTG

QPMINLYTDRETGKLKGEATVSFDDPPSAKAAID 

WFDGKEFSGNPIKVSFATRRADFNRGGGNGRGG 

RGRGGPMGRGGYGGGGSGGGGRGGFPSGGGGG 

GGQQRAGDWKCPNPTCENMNFSWRNECNQCK 

APKPDGPGGGPGGSHMGGNYGDDRRGGRGGYD 

RGGYRGRGGDRGGFRGGRGGGDRGGFGPGKM 

DSRGEHRQDRRERPY 

2987 

A 

1376 

898 

GGAKAGGAPHPFTLPFRHVGGLSAAPEEVEGML 

WAGARQHGRNWRKRETSPGTQGPLPPVPR/VPP 

GPDG\PHAIAPTLSWAIPRQQCSPQPGRLNALPPD 

RCSGPHFGDRAPESCFPGACSVSGACAFKGTRPA 

CPPQEPSLRSSRNRLREGQTFGRME1 

2988 

A 

1 

1011 

MGNDSVSYEYGDYSDLSDRPVDCLDGACLAIDP 

LRVAPLPLYAAIFLVGVPGNAMVAWVAGKVAR 

RRVGATWLLHLAVADLLCCLSLPILAVPIARGGH 

WPYGAVGCRALPSIILLTMYASVLLLAALSADLC 

FLALGPAW\CLRFS/GACGVQVACGAAWTLALL 

LTVPSAIYRRLHQEHFPARLQCVVDYGGSSSTEN 

AVTAIRFLFGFLGPLVAVASCHSALLCWAARRC 

RPLGTAIVVGFFVCWAPYHLLGLVLTVAAPNSA 

LLARALRAEPLIVGLALAHSCLNPMLFLYFGRAQ 

LRRSLPAACHWALRESQGQDESVDSKKSTSHDL 

VSEMEV 

2989 

A 

27 

4074 

KSQLFCFWVGKAGDILSGDQDKEQKDPYFVETP 

YGYQLDLDFLKYVDD1QKGNTIKRLNIQKRRKPS 

VPCPEPRTTSGQQGIWTSTESLSSSNSDDNKQCP 

NFLIARSQVTSTPISKPPPPLETSLPFLTIPENRQLP 

PPSPQLPKHNLHVTKTLMETRRRLEQERATMQM 

TPGEFRRPRLASFGGMGTTSSLPSFVGSGNHNPA 

KHQLQNGYQGNGDYGSYAPAAPTTSSMGSSIRH 

SPLSSGISTPVTNVSPMHLQHIREQMAIALKRLKE 

LEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRA 

ASQINVCGVRKRSYSAGNASQLEQLSRARRSGG 

ELYIDYEEEEMETVEQSTQRIKEFRQUTADMQA 

LEQKIQDSSCEASSELRENGECRSVAVGAEENMN 

DIWYHRGSRSCKDAAVGTLVEMRNCGVSVTEA 

MLGVMTEADKEIELQQQTEESLKEKJYRLEVQLR 

ETTHDREMTKLKQELQAAGSRKKVDKATMAQP 

LVFSKVVEAWQTRDQMVGSHMDLVDTCVGTS 

VETNSVGISCQPECKNKVVGPELPMNWWIVKER 

VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTE 

ESVNDLTLLKTNLNLKEVRSIGCGDCSVDVTVCS 

PKECASRGVNTEAVSQVEAAVMAVPRTADQDT 

STDLEQVHQFTNTETATLIESCTNTCLSTLDKQTS 

TQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLL 

SGHSGFDRPSAVKTKESGVGQININDNYLVGLK 

MRTIACGPPQLTVGLTASRRSVGVGDDPVGESLE 

NPQPQAPLGMMTGLDHYIERJQKLLAEQQTLLA 

ENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVM 

KSASTEELRNPDFQKTSLGKITGSYLGYTCKCGG 

LQSGSPLSSQTSQPEQEVGTSEGKPISSLDAFPTQ 

EGTLSPVNLTDDQIAAGLYACTNNESTLKSIMKK 

KDGNKDSNGAKKNLQFVGINGGYETTSSDDSSS 

DESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAE 

GHHAVNIEGLKSARVEDEMQVQECEPEKVEIRE 

RYELSEKMLSACNLLKNTINDPKALTSKDMRFC 

LNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAISP 

D VLRYVINLADGNGNTALHY S V SHSNFEI VKLLL 

DADVCNVDHONKAGYTPIMLAAI AAVFAFKDM 

RJVEELFGCGDVNAKASQAGQTALMLAVSHGRI 

DMVKGLLACGADVNIQDDEGSTALMCASEHGH 

VEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGH 

KDIAVLLYAHVNFAKAQSPGTPRLGRKTSPGPTH 

RGSFD 

2990 

A 

69 

1687 

ERLRPGQRAIRGPVPAAGACASLPPRAGPAQGRH 

AALGGAEPGSHLHCGVRLQRREEPGGQQRLLPQ 

RGGSAQTGHQHPGPYECQCPGPQPGGTTPALLSL 

ILEETRGPPASANPDKDHSTQPGTMGRKK1QISRJ 

LDQRNRQVTFTKJRKFGLMKKAYELSVLCDCEIA 

LIIFNSATRLFQYASTDMDRVLLKYTEYSEPHESR 

TNTDILETLKRRGIGLDGPELEPDEGPEEPGEKFR 

RLAGEGGDPALPRPRL YPAAPAMPSPD V VYGAL 

PPPG\CDPSGLGEALPAQSRPSPFRPAAPKAGPPG 

LGHPLFSPSHLTSKTPPPLYLPTEGRRSDLPGGLA 

GPRGGLNTSRSLYSGLONPCSTATPGPPLGSFPFL 

PGGPPVGAEAWARRVPQPAAPPRRPPQSSIKSER 

LFLRPPGAPATFLRPSPIPCSSPGPWQSLCGLGPPV 

CAGCPWPTAGPGRRSPGGTSPERSPGTARARGDP 

\TSLQAFSEKTHTVTAPLRGGGLEVGGWTQSSAG 

GLLSFFLFVCISTNKNARGVRGPEKK 

2991 

A 

3 

1159 

IPQPLHCASPKEEMSLRCGDAARTLGPRVFGRYF 

CSPVRPLSSLPDKKKELLQNGPDLQDFVSGDLAD 

RSTWDEYKGNLKRQKGERLRLPPWLKTEIPMGK 

NYMCLKNTLRNLNLHTVCEEARCPN1GECWGGG 

EYATATATIMLMGDTCTRGCRFCSVKTARNPPP 

LDASEPYNTAKAIAEWGLDYWLTSVDRDDMP 

DGGAEHIAKTVSYLKERNPKILVECLTPDFRGDL 

KA1EKVALSGLDVYAH>TV^TVPELQSKVRDPRA 

NFDQSLRVLKHAKKVQPDVISKTSIMLGLGENDE 

QVYATMKALREADVDCLTLGQYMQPTRRHLKV 

EEYITPEKFKYWEKVGNELGFHYTASGP\LVRSS 

YKAGEFFLKNLVAKRKTKDL 

2992 

A 

3 

1636 

PVPGVPTSPPSCCPQDMQGPWVLLLLGLRLQLSL 

GVIPAEEENPAFWNRQAAEALDAAKKLQPIQKV 

AKNL1LFLGDGLGVPTVTATRJLKGQKNGKLGPE 

TPLAMDRFPYLALSKTYNVDRQVPDSAATATAY 

LCGVKANFQTIGLSAAARFNQCNTTRGNEVISV 

MNRAKQAGKSVGVVTTTRVQHASPAGTYAHTV 

NRNWYSDADMPASARQEGCQDIATQL1SNMDID 

VrLGGGRKYMFPMGTPDPEYPADASQNGIRLDG 

KNLVQEWLAKHQGAWYVWNRTELMQASLDQS 

VTHLMGLFEPGDTKYEIHRDPTLDPSLMEMTEA 

ALRLLSRNPRGFYLFVEGGR1DHGHHEGVAYQA 

LTEAVMFDDAIERAGQLTSEEDTLTLVTADHSH 

VFSFGGYTLRGSSIFGLAPSKAQDSKAYTSILYGN 

GPGYVFNSGVRPDVNESESGSPDYHQQAGWPLS 

SETHGGEDVAVFARGPQAHLVHGVQEQSFVAH 

VMAFAACLEPYTACDLAPPACTTDAAHPVAASL 

PLLAGTLLLLGASAAP 

2993 

A 

3 

685 

DAWARLLKMNRLFGKAKPKAPPPSLTDCIGTVD 

SRAESIDKK1SRLDAELVKYKDQIKKMREGPAKN 

MVKQKALRVLKQICRMYEQQRDNLA\NSHSTW\ 

TS\HYTIQSLKDTKTTVDAMKLGVKEMKKAYKQ 

VKJDQIEDLQDQLEDMMEDANEIQEALSRSYGTP 

ELDEDDLEAELDALGDELLADEDSSYLDEAASA 

PAIPEGVPTDTKNKDGVLVDEFGLPQ1PAS 

2994 

A 

1710 

161 

RRCELTPFIIKTLILPKSWGAFPEDVVMQHVSSSQ 

SSQRHVQWPGACPGAGEEQPACSQPSLPLTLPSP 

SHQLQQLMVRGGPAGGQNMNVDLQGVGPGLQ 

GSPQVTLAPLPLPSPTSPGFQFSAQPRRFEHGSPS 

Y1QVTSPLSQQVQTQSPTQPSPGPGQALQNVRAG 

APGPGLGLCSSSPTGDFVDASVLVRQISLSPSSGG 

HFVFQDGSGLTQIAQGAQVQLQHPGTPITVRERR 

PSQPHTQSGGTIHHLGPQSPAAAGGAGLQPLASP 

SHITTANLPPQISSIIQGQLVQQQQVLQGPPLPRPL 

GFERTPGVLLPGAGGAAGFGMTSPPPPTSPSRTA 

VPPGLSSLPLTSVGNTGMKKVPKKLEEIPPASPE 

MAQMRKQCLDYHHQEMQALKEVFKEYLIELFF 

LQHFQGNMMDFLAFKERLYGPLQAYLRQNDLDI 

EEEEEE\HFEVINDEVKVVARKHGQPGTPVAIAT\ 

QLPPRTSAAFPAQQQPLQVLSDGSTVQLPRLSSL 

GFEDSMC 

2995 

A 

3 

924 

SAPSGIDASTHAFARCKHPINVRRDPSIPIYGLRQS 

ILLNTRLQDCYVDSPALTNIWMARTCAKQNINAP 

APATTSSWEWRNPLIASSFSLVKLVLRRQLKNK 

CCPPPCKFGEGKLSKJILKJ1KDDSVMKATQQARK 

RNFISSKSKQPAGHRRPAGGIRESKESSKEKKLTV 

RQDLEDRYAEHVAAT\QALPQDSGTAAWKG\RV 

LLPETQKRQQLSEDTLTIHGLPTEGYQALYHAVV 

EPMLWNPSGTPKRYSLELGKA1KQKLWEALCSQ 

GAISEGAQRDRFPGRKQPGVHEEPVLKKWPKLK 

SKK 

2996 

A 

3 

1713 

GKFGIKPSQRRISGKSTFHSEMEGEDTRDDSLYSI 

LEELWQDAEQIKRCQEKHNKLLSRTTFLNKKILN 

TEWDYEYKDFGKFVHPSPNLILSQKRPHKRDSFG 

KSFKJ1NLDLHIHNKSNAAKNLDKTIGHGQVFTQ 

NSSYSHHENTHTGVKFCERNQCGKVLSLKHSLS 

QNVKFPIGEKANTCTEFGKIFTQRSHFFAPQKIHT 

VEKPHELSKCVNVFTQKPLLSIYLRVHRDEKLYIV 

CTKM/CGKGLHPRNSELIMHEKTHTREKPYKCNE 

\CGKSFFQVSSLLRHQTTHTGEKLFECSECGKGFS 

LNSALNIHQKIHTGERHHKCSECGKAFTQKSTLR 

MHQRIHTGERSYICTQCGQAFIQKAHL1AHQRIH 

TnFKPYFrSDCGKSFPSKSOLOMHKRlHTGEKPY 

ICTECGKAFTNRSNLNTHQKSHTGEKSYICAECG 

KAFTDRSNFNKHQTIHTGEKPYVCADCGRAFIQK 

SELITHQR1HTTEKPYKCPDCEKSFSKKPHLKVHQ 

RJHTGEKPYICAECGKArTDRSNFNKHQTIHTGD 

KPYKCSDCGKGFTQKSVLSMHRNIHT 

2997 

A 

3 

1763 

AASTRTMGSRHFEGIYDHVGHFGRFQRVLYFICA 

FQNISCGIHYLASVFMGVTPHHVCRPPGNVSQVV 

FHNHfSNWSLEDTGALLSSGQKDYVTVQLQNGEI 

WELSRCSRNKRENTSSLGYEYTGSKKEFPCVDG 

YIYDQNTWKSTAVTQWNLVCDRKWLAMLIQPL 

FMFGGPTGIG/VTFGYRSDRLGRRVVLWATSSS 

MFLFGIAAAFAVDYYTFMAARFFLAMVASGYLV 

VGFV YVMEFIGMKSRTWASVHLHSFFAVGTLLV 

ALTGYLVRTWWLYQMILSTVTVPFILCCWVLPE 

TPFWLLSEGRYEEAQKMVDIMAKWNRASSCKLS 

ELLSLDLQGPVSNSPTEVQKHNLSYLFYNWSITK 

RTLTVWLIWFTGSLGFYSFSLNSVNLGGNEYLNL 

ALACGVVMVIPQKHYILGVVTAMWGKILPIGAA 
FG\LIYLYTAELYPTIVRSLAVGSGSMVCRLAS1L 
APFSVDLSSIWIFIPQLFVGTMALLSGVLTLKLPE 
TLGKRLATT WEEAAKLESENESKSSKLLLTTNN S 
GLEKTEAITPRDSGLGE 

2998 

A 

3 

1441 

QRPASQLLAPFAAEALPGAPRAAMAQHFSLAAC 

DVVGFDLDHTLCRYNLPESAPLIYNSFAQFLVKE 

KGYDKELLNVTPEDWDFCCKGLALDLEDGNFL 

KLANNGTVLRASHGTKMMTPEVLAEAYGICKEW 

KHFLSDTGMACRSGKYYFYDNYFDLPGALLCAR 

VVDYLTKLNNGQKTFDFWKDIVAAIQHNYKMS 

AFKENCGIYFPEIKRDPGRYLHSRPESVKKWLRQ 

LKNAGKILLLITSSHSDYCRLLCAWILGNDFTDLF 

DIVITNALKPGFFSHLPSQRPFRTLENDEEQEALP 

<U nKPnWYSOGNAVHLYELLKKMTGKPEPKVV 

YFGDSMHSDIFPARHYSNWETVLILEELRGDEGT 

RSQRPEESEPLEKKGKYEGPKAKPLNTSSKKWGS 

FFMDSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

EAIAELPLDYKFTRFSSSNSKTAGYYPNPPLVLSS 

DETLISK 

2999 

A 

320 

2417 

LRRRKMTPQSLLQTTLFLLSLLFLVQGAHGRGHR 

EDFRFCSQRNQTHRSSLHYKPTPDLRISIENSEEA 

LTVHAPFPAAHPASRSFPDPRGLYHFCLYWNRH 

AGRLHLLYGKRDFLLSDKASSLLCFQHQEESLAQ 

GPPLLATSVTSWWSPQNISLPSAASFTFSFHSPPH 

TGAHNASVDMCELKRDLQLLSQFLKHPQKASRR 

PSAAPASQQLQSLESKLTSVRFMGDMGSFEEDRI 

NATVWKLQPTAGLQDLH1HSRQEEEQSEIMEYS 

VLLPRTLFQRTKGRSGEAEKRLLLVDFSSQALFQ 

DKNSSQVLGEKVLGIVVQNTKVANLTEPVVLTFI 

QHQLQPKNVTLQCVFWVEDPTLSSPGHWSSAGC 
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NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





ETVRRETQTSCFCNHLTYFAVLMVSSVEVDAVH 

KHYLSLLSYVGCVVSALACLVTIAAYLCSRVPLP 

CRRKPRDYTIKVHMNLLLAVFLLDTSFLLSEPVA 

LTGSEAGCRASAIFLHFSLLTCLSWMGLEGYNLY 

RLVVEVFGTYVPGYLLKLSAMGWGFPIFLVTLV 

ALVDVDNYGPIILAVHRTPEGVIYPSMCWIRDSL 

VSYITNLGLFSLVFLFNMAMLATMVVQILRLRPH 

TQKWSHVLTLLCLSLVLG\LPWALIFFSFASGTFQ 

LVVLYLFSIITSFQGFLIFIWYWSMRLQARGGPSP 

LKSNSDSARLPISSGSTSSSRI 

3000 

A 

66 

1003 

SRGQLDAGQSSEQHGGNRQPEQSRSRSSSSSSSP 

RRSRSAAEPAMALSMPLNGLKEEDKEPLIELFVK 

AGSDGESIGNCPFSQRLFM1LWLKGVVFSVTTVD 

LKRKPADLQNLAPGTHPPFITFNSEVKTDVNKIEE 

FLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSA 

YIKNSRPEANEALERGLLKTLQKLDEYLNSPLPD 

EIDENSMEDIKFSTRKFLDGNEMTLADCNLLPKL 

HIVKVVAKKYRNFDIPKEMTGIWRYLTNAYSRD 

EFTNTCPSDKEVEIVAYSDVAKRLHQVKSRLLKE 

VSFMSSP 

3001 

A 

779 

2006 

LALTFRSALSTLPGSPMTSSGSPDLQLAWGPSLLP 

HPPSVWSPALPSCFAGPCPLLPLSDTQGWWGPN 

WLAPPSAALCRPDAAVWPDLPSSNILLVTPPPAK 

*SAVAV*PCPRGAHSLERAARQYTISGSSTSQSGK 

CSKRDTKCCAVTTSWGCFWQKHWKGDEDSGW 

AFQEGSHLGEGHL 

3002 

A 

909 

2799 

VEEAWTVWLHWGVRECLLEEETNQKEEAASSN 

WTKARGPFWQEDWVWDMRLKMTTRNFPEREV 

PCDVEVERFTREVPCLSSLGDGWDCENQEGHLR 

QSALTLEKPGTQEAICEYPGFGEHLIASSDLPPSQ 

RVLATNGFHAPDSNVSGLDCDPALPSYPKSYAD 

KRTGDSDACGKGFNHSMEVIHGRNPVREKPYKY 

PESVKSFNHFTSLGHQKIMKRGKXSYEGKNFENI 

FTLSSSLNENQRNLPGEKQYRCTECGKCFKRNSS 

LVLHHRTHTGEKPYTCNECGKSFSKNYNLIVHQ 

RIHTGEKPYECSKCGKAFSDGSALTQHQRIHTGE 

KPYECLECGKTFNRNSSLELHQRTHTGEKPYRCN 

ECGKPFTDISHLTVHLRIHTGEKPYECSKCGKAF 

RDGSYLTQHERTHTGEKPFECAECGKSFNRNSHL 

IVHQKIHSGEKPYECKECGKTFIESAYLIRHQRJH 

TGEKPYGCNQCQKLFRNIAGLIRHQRTHTGEKPY 

ECNQCGKAFRDSSCLTKHQRIHTKETPYQCPECG 

KSFKQNSHLAVHQRLHSREGPSRCPQCGKMFQK 

SSSLVRHQRAHLGEQPMET*WLGAT*VFQFTLTP 

VFRRRVLDLTPLWSVEKNPLSYPVN 

3003 

A 

2 

1489 

SLTEHLSFFQPTAHSLTSLLGTMTTCSRQFTSSSS 

MKGSCGIGGGIGGGSSRISSVLAGGSCRAPSTYG 

GGLSVSSRFSSGGACGLGGGYGGGFSSSSSFGSG 

FGGGYGGGLGAGFGGGLGAGFGGGFAGGDGLL 

VGSEKVTMQNLNDRLASYLDKVRALEEANADL 

EVKIRDWYQRQRPSEIKDYSPYFKTIEDLRNKIIA 

ATIENAQPILQIDNARLAADDFRTKYEHELALRQ 

TVEADVNGLRRVLDELTLARTDLEMQIEGLKEE 

LAYLRKNH*EEMLALRGQTGGEVNVETDAAPG 

VDLSCILNEMRNQYEQMAEKNRRDAETWFLSKT 




WO 01/57190 


PCT/US01/04098 


SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





EELNKEVASNSELVQSSRSEVTELRRVLQGLEIEL 
QSQLSMKASLENSLEETKGRYCMQLSQIQGLIGS 
VEEQLAQLRCEMEQQSQEYQILLDVKTRLEQEIA 
TYRRLLEGEDAHLSSQQASGQSYSSREVFTSSSSS 
SSRQTRP1LKEQSSSSFSQGQSS 

3004 

A 

2 

940 

GCAPDTRFFVPEPGGRGAAPWVALVARGGCTFK 

DKVLVAARRNASAVVLYNEERYGNITLPMSHAG 

TGNIVVIMISYPKGREILELVQKGIPVTMTIGVGT 

RHVQEFISGQSVVFVAIAFITMMIISLAWLIFYYIQ 

tvr L. Y I Oov^lOo^orllvJSJil JPuvVlUl^L>L.L.rl 1 ViSJrlvjE. 

KGIDVDAENCAVCIENFKVKDIIRILPCKHIFHRJC 

IDPWLLDHRTCPMCKLDVIKALGYWGEPGDVQE 

MPAPESPPGRDPAANLSLALPDDDGSDESSPPSA 

SPAESEPQCDPSFKGDAGENTALLEAGRSDSRHG 

GP1S 

3005 

A 

184 

2552 

TMTIHQFLLLFLFWVCLPHFCSPEIMFRRTPVPQQ 

RILSSRVPRSDGK1LHRQKRGWMWNQFFLLEEY 

TGSDYQYVGKLHSDQDKGDGSLKYILSGDGAGT 

LFIIDEKTGDIHATRRIDREEKAFYTLRAQAINRR 

TLRPVEPESEFVDCIHDINDNEPTFPEEIYTASVPE 

MSVVGTSVVQVTATDADDPSYGNSARVIYSILQ 

GQPYFSVEPETGIIRTALPNMNRENREQYQVVIQ 

AKDMGGQMGGLSGTTTVNITLTDVNDNPPRFPQ 

NTIHLRVLESSPVGTAIGSVKATDADTGKNAEVE 

YPJIDGDGTDMFD1VTEKDTQEGIITVKKPLDYES 

RRLYTLKVEAENTHVDPRFYYLGPFKDTTIVKISI 

EDVDEPPVFSRSSYLFEVHEDIEVGTIIGTVMARD 

PDSISSPIRFSLDRHTDLDRIFNIHSGNGSLYTSKP 

LDRELSQWHNLTVIAAEINNPKETTRVAVFVR1L 

DANDNAPQFAVFYDTFVCENARPGQLIQTISAVD 

KDDPLGGQKFFFSLAAVNPNFTVQDhfEDNTARIL 

TRKNGFNRHEISTYLLPVVISDNDYPIQSSTGTLTI 

LCniLLVIVVLFAALKRQRKKEPLILSKEDTRDNIV 

SYNDEGGGEEDTQAFDIGTLRNPAAffiEKKLRRD 

IIPETLFIPRRTPTAPDNTDVRDFINERLKEHDLDP 

TAPPYDSLATYAYEGNDSIAESLSSLESGTTEGD 

QNYDYLREWGPRFNKLPQKYGGGESDKDS 


A 
rx 


J41 

ijrJtYV UK. 1 W WOJvo VvjliVlJL#l cJLJci]S^AL.XN olllJ V xxlivi 

SLIKGNFHAVYRDDLKKLLETECPQYIRKKGAD 

VWFKELDINTDGAVNFQEFLILVIKMGVAALNSII 

DVYHKYSLIKGNFHAVYRDDLQKLLETECPQYI 

RKKGADVWFKELDINTDGAVNFQEFLILV1KMG 

VGSPQKXVASYF 

3007 

A 

1 

1253 

MYEGIRCLLKALLGFVSLAIGTLYCPRQYRPFPG 

SLGIEAINVPEPIPDSYYRDMATWPTHAPSVEEG 

GQGRFGNQADHFLGSLAFAKLLNRSLAVPSWIE 

YQHHKPPFTNLHVSYQKYFKLEPLQAYHRVISLE 

DFMEKLAPTHWPPEKRVAYCFEVAAQRSPDKKT 

v^a ivjjrvJU/Vji>i 1 vjr 1 v» n v or lNrwoijiji 1 vjiji oao 

YREQWSQRFSPKEHPVLALPGAPAQFPVLEEHRP 

LQKYMVWSDEMVKTGEAQIHAHLVRPYVGIHL 

RIGSDWKNACAIVILKIXjTAGSHFMASPQCVGYS 

RSTAAPLTMTMCLPDLKEIQRAVKLWVRSLDAQ 

SVYVATDSESYVPELQQLFKGKVKVVSLKPEVA 
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SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





QVDLYILGQADHFIGNCVSSFTAFVKRERDLQGR 
PSSFFGMDRPPKLRDEF 

3008 

A 

3136 

1898 

TARGGGSEPGPTMAANYSSTSTRREHVKVKTSS 
QPGFLERLSETSGGMFVGLMAFLLSFYLIFTNEG 
RALKTATSLAEGLSLVVSPDSIHSVAPENEGRLV 
HIIGALRTSKJLLSDPNYGVrlLPAVKI.RRHVEMY 
QWVETEESREYTEDGQVKKETRYSYNTEWRSEII 
NSKNFDREIGHKNPRAMAGESFMATAPFVQIGRF 
FLSSGLIDKVDNFKSLSLSKLEDPHVDIIRRGDFF 
YHSFNPKYPFVGDI RVSFSYAG1 ^GDDPni HP A 

HVVTV1ARQRGDQLVPFSTKSGDTLLLLHHGDFS 

AEEVFHRELRSNSMKTWGLRAAGWiVIAMFMGL 

NLMTRILYTLVDWFPVFRDLVNIGLKAFAFCVAT 

SLTLLTVAAGWLFYRPLWALLIAGLALVPILVAR 

TRVPAKKLE 

3009 

A 

93 


DA AVAMTAOGGT VANRfiRRFlfWATFl <\flPnr;f; 

SRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQET 

DR1LVEKRCWDIALGPLKQIPMNLFIMYMAGNTI 

SIFPTMMVCMMAWRPIQALMAISATFKMLESSS 

QKFLQGLVYLIGNLMGLALAVYKCQSMGLLPTH 

ASDWLAFIEPPERMEFSGGGLLL 

3010 

A 

2 

1041 

LIDSAKARYWTQRGTWVYDNALLLLLKCLWSN 

VVPECTMASSNTVLMRLVASAYSIAQKAGMIVR 

RVIAEGDLGIVEKTCATDLQTKADRLAQMSICSS 

LARKFPKLTIIGEEDLPSEEVDQELIEDSQWEEILK 

QPCPSQYSAIKEEDLVVWVDPLDGTKEYTEGLL 

n>JVTVT TniAVFnk'ATAr;vT>jnpvYTJVFAf~;priA\/ 

LGRTIWGVLGLGAFGFQLKEVPAGKHI1TTTRSH 
SNKLVTDCVAAMNPDAVLRVGGAGNKIIQLIEG 
KASAYVFASPGCKKWDTCAPEV1LHAVGGKLTD 
IHGNVLQYHKDVKJHMNSAGVLATLRNYDYYAS 
RVPESIKNALVP 

3011 

A 

291 

1452 

SPQKTMRSHTITMTTTSVSSWPYSSHRMRFITNH 

SDQPPQNFSATPNVTTCPMDEKLLSTVLTTSYSVI 

FIVGLVGNIIALYVFLGIHRKRNSIQIYLLNVAIAD 

LLLIFCLPFRIMYHINQNKWTLGVILCKVVGTLFY 

MNMYISIILLGFISLDRYIKINRSIQQRKAITTKQSI 

YVCCIVWMLALGGF1 TM1TT TT KKGGHNSTMCF 

HYRDKHNAKGEAIFNFILVVMFWLIFLLnLSYIKI 

GKNLLRISKRRSKFPNSGKYATTARNSFIVLIIFTI 

CFVPYHAFRFTYISSQLNVSSCYWKEIVHKTNEIM 

LVLSSFNSCLDPVMYFLMSSNIRKIMCQLLFRRF 

QGEPSRSESTSEFKPGYSLHDTSVAVKIQSSSKST 

3012 

A 

246 

1346 

TEPVGYTKAEEPIAMRSLGALLLLLSACLAVSAG 

PVPTPPDNIQVQENFNISRIYGKWYNLAIGSTCPW 

LKKIMDRMTVSTLVLGEGATEAEISMTSTRWRK 

GVCEETSGAYEKTDTDGKFLYHKSKWNITMESY 

VVHTNYDEYAIFLTKKFSRHHGPTITAKLYGRAP 

ot rftt t onFRVVAnnvnTPFri^TFTMAnRnFrv 

PGEQEPEPILlPRVRRAVLPQEEEGSGGGQLVTEV 

TKKEDSCQLGYSAGPCMGMTSRYFYNGTSMAC 

ETFQYGGCMGNGNNFVTEKECLQTCRTVAACN 

LPIVRGPCRAFIQLWAFDAVKGKCVLFPYGGCQ 

GNGNKFYSEKECREYCGVPGDGDEELLRFSN 

3013 

A 

67 

379 

RQMALLKANKDLISAGLKEFSVLLNQQVFNDPL 
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NO: 

Method 
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beginning 
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location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





VSEEDMVTVVEDWMNFYINYYRQQVTGEPQER 
DKALQELRQELNTLANPFLAKYRDFLKSHELPSH 
PPPSS 

3014 

A 

1 

373 

GTSWSTLRAVMSASVVSVVSRVLEEYLSSTPQRL 
KLLDAYLLYILLTGALQFGYCLFVLTFHFNSLLLF 
FFFCVGSFHSNVYFLLFTLSFLCFLFIAYFFLIRFFS 
LFIWFFHVFFIELSLFYF 

3015 

A 

2 

1321 

AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERKITRNQK^XHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHKIYCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFL1AFSYELSKLESTVGSPEKPLSDLGKLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDIIST 

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KXPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT 

3016 

A 

2 

1321 

AAAEGTAPSPGRVSPPTPARGEPEVTVEIGETYLC 

RRPDSTWHSAEVIQSRVNDQEGREEFYVHYVGF 

NRRLDEWVDKNRLALTKTVKDAVQKNSEKYLS 

ELAEQPERK1TRNQKRKHDEINHVQKTYAEMDP 

TTAALEKEHEAITKVKYVDKIHIGNYEIDAWYFS 

PFPEDYGKQPKLWLCEYCLKYMKYEKSYRFHLG 

QCQWRQPPGKEIYRKSNISVYEVDGKDHK1YCQ 

NLCLLAKLFLDHKTLYFDVEPFVFYILTEVDRQG 

AHIVGYFSKEKESPDGNNVACILTLPPYQRRGYG 

KFLIAFSYELSKLESTVGSPEKPLSDLGBCLSYRSY 

WSWVLLEILRDFRGTLSIKDLSQMTSITQNDIIST 

LQSLNMVKYWKGQHVICVTPKLVEEHLKSAQY 

KKPPITGGWGAAVCRGRWGSVSIWTGRSQGLLI 

AVT 

3017 

A 

38 

704 

EAHPGGQLGSERNGVRMDEDVLTTLKILIIGESG 

VGKSSLLLRFTDDTFDPELAATIGVDFKVKTISVD 

GNKAKLAIWDTAGQERFRTLTPSYYRGAQGVIL 

VYDVTRRDTFVKLDNWLNELETYCTRNDIVNM 

LVGNKIDKENREVDRNEGLKFARKHSMLFIEAS 

AKTCDGVQCAFEELVEKIIQTPGLWESENQNKG 

VKLSHREEGQGGGACGGYCSVL 

3018 

A 

2640 

2861 

APVLILQMVKLSIVLTPQFLSHDQGQLTKELQQH 
VKSVTCPCEYLRKVSECRQMGPGALEQFPGLSC 
HTSHSG 

3019 

A 

1307 

711 

PGITMAASLVGKKIVFVTGNAKKLEEWQILGDK 

FPCTLVAQKJDLPEYQGEPDEISIQKCQEAVRQV 

QGPVLVEDTCLCFNALGGLPGPYIKWFLEKLKPE 

GLHQLLAGFEDKSAYALCTFALSTGDPSQPVRLF 

RGRTSGRIVAPRGCQDFGWDPCFQPDGYEQTYA 

EMPKAEKNAVSHRFRALLELQEYFGSLAA 

3020 

A 

1202 

180 

VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAA 
LVFYSCIFIIGLFWITALWWSCTTKXRTTVTIYM 
MNVALVDLIFnviTLPFRMFYYAKDEWPFGEYFC 
QILGALTVFYPSIALWLLAFISADRYMAIVQPKY 
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SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 
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Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





AKELKNTCKAVLACVGVW1MTLTTTTPLLLLYK 

DPDKX)STPATCLKISDIIYLKAVNVLNLTRLTFFF 

LIPLFIM1GCYLVIIHNLLHGRTSKLKPKVKEKSIRI 

IITLLVQVLVCFMPFHICFAFLMLGTGENSYNPW 

GAFTTFLMNLSTCLDVILYYIVSKQFQARVISVM 

LYRNYLRSMRRKSFRSGSLRSLSNTNSEML 

3021 

A 

27 

1897 

EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQT 
KRKKPRRYWEEETVPTTAGASPGPPRNKKNREL 
RPQRPKNAYILKKSRISKKPQVPKKPREWKNPES 
QRGLSGAQDPFPGPAPVPVEVVQKFCRIDKSRKL 
PHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEE 
PGFLEGEDGEDTAKICQADIVEAVDIASAAKHFD 
LNLRQFGPYRLNYSRTGRHLAFGGRRGHVAALD 
WVTKKLMCEINVMEAVRDIRFLHSEALLAVAQN 
r ;RWLHI YDNQGIELHCIRRCDR VTRLEFLPFHFLLA 
TASETGFLTYLDVSVGKIVAALNARAGRLDVMS 
QNPYNAVIHLGHSNGTVSLWSPAMKEPLAKILC 
HRGGVRAVAVDSTGTYMATSGLDHQLKIFDLRG 
TYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDVV 
NIWAGQGKASPPSLEQPYLTHRLSGPVHGLQFCP 
FEDVLGVGHTGGITSMLVPGAGEPNFDGLESNPY 
RSRKQRQEWEVKALLEKVPAELICLDPRALAEV 
DVISLEQGKXEQIERLGYDPQAKAPFQPKPKQKG 
RSSTASLVKJIKRKVMDEEHRDKVRQSLQQQHH 
KEAKAKPTGARPSALDRFVR 

3022 

A 

1 

2249 

MTAQDSNTSAHAQRDGPELPASSSWRSFWPLSC 

LSSPPVSAVEVATEGRDREVAKVGQRFCDTTSGE 

LRQARDRDCCVRMPAPVGRRSPPSPRSSMAAVA 

LRDSAQGMTFEDVAIYFSQEEWELLDESQRFLYC 

DVMLENFAHVTSLGYCHGMENEAIASEQSVSIQ 

VRTSKGNTPTQKTHLSEIKMCVPVLKDILPAAEH 

QTTSPVQKSYLGSTSMRGFCFSADLHQHQKHYN 

EEEPWKRKVDEATFVTGCRFHVLNYFTCGEAFP 

APTDLLQHEATPSGEEPHSSSSKHIQAFFNAKSYY 

KWGEYRKASSHKHTLVQHQSVCSEGGLYECSK 

CEKAFTCKNTLVQHQQIHTGQKlvIFECSECEESFS 

KKCHLILHKI1HTGERPYECSDREKAFIHKSEFIHH 

QRRHTGGVRHECGECRKTFSYKSNLIEHQRVHT 

GERPYECGECGKSFRQSSSLFRHQRVHSGERPYQ 

CCECGKSFRQ1FNLIRHRRVHTGEMPYQCSDCGK 

SFSCKSELIQHQRJHSGERPYECRECGKSFRQFSN 

LIRHRSIHTGDRPYECSECEKSFSRKFILIQHQRVH 

TGERPYECSECGKSFTRKSDLIQHRRIHTGTRPYE 

GSECGKSFRQRSGLIQHRRLHTGERPYECSECGK 

SFSQSASLIQHQRVHTGERPYQCCECGKSFRQIFN 

LIRHRRVHTGEMPYQCSDCGKSFSCKSELIQHRRI 

HSGERPYECSECGKSFSRKSNLIRHRRVHTEERP 

3023 

A 

3148 

634 

AAGALRCLAAFPRAEPASRGRQSSPARACAASR 

AERATAAAMAHRCLRLWGRGGCWPRGLQQLL 

VPGGVGPGEQPCLRTLYRFVTTQARASRNSLLTD 

IIAAYQRFCSRPPKGFGKYFPNGKNGKKASEPKE 

VMGEKKESKPAATTRSSGGGGGGGGKRGGKKD 

DSHWWSRFQKGDIPWDDKDFRMFFLWTALFWG 

GVMFYLLLKRSGREITWKDFVNNYLSKGVVDRL 

EVWKRFVRVTFTPGKTPVDGQYVWFNIGSVDT 
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SEQID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





FERNLETLQQELGIEGENRVPVVYIAESDGSFLLS 

MLPTVLIIAFLLYTIRRGPAAIGRTGRGMGGLFSV 

GETTAKVLKDEIDVKFKDVAGCEEAKLEIMEFV 

NFLKNPKQYQDLGAKIPKGAILTGPPGTGKTLLA 

KATAGEANVPFITVSGSEFLEMFVGVGPARVRDL 

FALARKNAPCILFIDEIDAVGRKRGRGNFGGQSE 

QENTLNQLLVEMDGFNTTTNVVILAGTNRPDILD 

PALLRPGRFDRQIFIGPPDIKGRASIFKVHLRPLKL 

DSTLEKDKLARKLASLTPGFSGADVANVCNEAA 

LIAARHLSDSINQKHFEQAIERVIGGLEKKTQVLQ 

PEEKKTVAYHEAGHAVAGWYLEHADPLLKVSII 

PRGKGLGYAQYLPKEQYLYTKEQLLDRMCMTL 

GGRVSEEIFFGRITTGAODDLRKVTOSAYAOIVO 

FGMNEKVGQISFDLPRQGDMVLEKPYSEATARLI 

DDEVRILINDAYKRTVALLTEKKADVEKVALLL 

LEKEVLDKNDMVELLGPRPFAEKSTYEEFVEGT 

GSLDEDTSLPEGLKDWNKEREKEKEEPPGEKVA 

N 

3024 

A 

274 

1455 

LRACSLPSMSALEKSMHLGRLPSRPPLPGSGGSQ 

SGAKMRMGPGRKRDFSPVPWSQYFESMEDVEV 

ENETGKDTFRVYKSGSEGPVLLLLHGGGHSALS 

WAVFTAAIISRVQCRIVALDLRSHGETKVKNPED 

LSAETMAKDVGNVVEAMYGDLPPPIMLIGHSMG 

GAIAVHTASSNLVPSLLGLCMIDVVEGTAMDAL 

NSMONFLRGRPKTFKSLENAIEWSVKSGOIRNI F 

SARVSMVGQVKQCEGITSPEGSKSIVEG1IEEEEE 

DEEGSESISKJIKKEDDMETKKDHPYTWRIELAKT 

EKYWDGWFRGLSNLFLSCPIPKLLLLAGVDRLD 

KDLTIGQMQGKFQMQVLPQCGHAVHEDAPDKV 

AEAVATFLIRHRFAEPIGGFQCVFPGC 

3025 

A 

621 

306 

YHGGQRGRAGGSFRSVQGWGGQLRNPFRTSKSL 
SWKGLSSLLFPLYNLQMGRPRDRKELGRGHSPP 
HLEGPHMLPSGAARWRWLEAPVLVLEPLVLRPA 
AAPTP 

3026 

A 

1533 

454 

AKVPQSTREEKRENGLEARSPAINLMGFNVEEM 

YEAHAWIQRlLSLQNfflHEN^ 

SQLQKTSSVSITEIISPGRTELEIEGARADLIEVVM 

NffiDMLCKVQEElVL\RKKERGLWRSLGQWTIQQ 

QKTQDEMKENIIFLKCPVPPTQELLDQKKQFEKC 

GLQVLKVEKJDNEVL1\^ 

QPVSHRLFQQVPYQFCNVVCRVGFQRMYSTPCD 

PKYGAG1YTFTKNLKNLAEKAKKJSAADKLIYVFE 

AEVLTGFFCQGHPLNIVPPPLSPGAIDGHDSVVD 

NVSSPETFVIFSGMQAIPQYLWTCTQEYVQSQDY 

SSGPMRPFAQHPWRGFASGSPVD 

3027 

A 

179 

703 

PFl^GASSNTFl^QVQTQESKAQKEVKMGFIFSK 
SlVQvfESMK^Q 

ERQMAMQIAWSREFLKYFGTFFGLAAISLTAGA1 
KKKKPAFLW1YPLSFILTYQYDLGYGTLLERMK 
GEAEDILETEKSK1QLPRGMITFESIEKARKEQSR 
FFIDK 

3028 

A 

876 

1226 

AVGKJEPESSSTWVRDREGmRSl^SMKMLWKLT 
DNIKYEDCEVSATPARSSVRSQAPSLTLPLLLLSL 
QPAAKRGWDKLSPAQRPSLGFARRTRGRSCRER 
TWMLPSLVSEFLHRD 
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SEQID 
NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 

3029 

A 

3 

1731 

FREGRFGSSCAVAAPLAGFQGLIECGYLAVDSPP 

SCWTPGGSNPAAPLPQALLPPRLPPTVLPFLGPGL 

SGELEMFTLPQKDFRAPTTCLGPTCMQDLGSSHG 

EDLEGECSRKLDQKLPELRGVGDPAMISSNTSYL 

SSRGRMIKWFWDSAEEGYRTYHMDEYDEDKNP 

SGIINLGTSENKLCFDLLSWRLSQRDMQRVEPSL 

LQYADWRGHLFLREEVAKFLSFYCKSPVPLRPE 

NVVVLNGGASLFSALATVLCEAGEAFLIPTPYYG 

A1TQHVCLYGNIRLAYVYLDSEVTGLDTRPFQLT 

VEKLEMALREAHSEGVKVKGLELISPQNPLGDVY 

SPEELQEYLVFAKRHRLHVIVDEVYMLSVFEKSV 

GYRSVLSLERLPDPQRTHVMWATSKDFGMSGLR 

FHT1 YTFNOHVATA VA9T PRVWHT <iiTr VDVnM 

AQLLRDRDWINQVYLPENHARLKAAHTYVSEEL 

RALGIPFLSRGAGFFIWVDLRKYLLKGTFEEEML 

LWRRFLDNKVLLSFGKAFECKEPGWFRFVFSDQ 

VHRLCLGMQRVQQVLAGKSQVAEDPRPSQSQEP 

SDQRR 

3030 

A 

1 

JOt 

VFSTSSLMLALSRHSLLSPLLSVTSFRRFYRGDSP 

TDSQKDMIEIPLPPWQERTDESIETKRARLLYESR 

KRGMLENCILLSLFAKEHLQHMTEKQLNLYDRLI 

NEPSNDWDIYYWATEAKPAPEEFENEVMALLRD 

FAKNKNKEQRLRAPDLEYLFEKPR 

3031 

A 

1177 

359 

SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSL 
CWSSCGQHPVQATHRGAVSNSLMLCELKLASQM 
PLENTTVQQMVFMLLSNLALSHDCKGVIQKSNF 

DGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLI 

FHKVCFSPANKPK1LANEKVITVLAACLESENQN 

AQRIGAAALWALIYNYQKAKTALKSPSVKRRVD 

EAYSLAKKTFPNSEANPLNAYYLKCLENLVQLL 

NSS 

3032 

A 

2 

1242 

GISGRPPRPAKRRMGKNPVRPPRALPPVPSQDDIP 
LSRPKKKKPRTKNTPASASLEGLAQTAGRRPSEG 
NEPSTKELKEHPEAPVQRRQKKTRLPLELETSST 
QKKSSSSSLLRNENG1DAEPAEEAVIQKPRRKTK 
KTQPAELQYANELGVEDEDIfTDEQTTVEQQSVF 
TAPTGISQPVGKVFVEKSRRFQAADRSELIKTTEN 
IDVSMDVKPSWTTRDVALTVHRAFRMIGLFSHG 
FT AGOAVWNTWTYVT AGHOT <?NT ^MT 1 OOVK'T 

LAYPFQSLLYLLLALSTISAFDRIDFAKISVAIRNF 
LALDPTALASFLYFTALILSLSQQMTSDRIHLYTP 
SSVNGSLWEAGIEEQILQPWIVVNLWALLVGLS 
WLFLSYRPGMDLSEELMFSSEVEEYPDKEKEIKA 
SS 

3033 

A 

3 

1436 

TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQ 

VRDTSSRIAKGGVDHTKMSLHGASGGHERSRDR 

RRSSDRSRDSSHERTESQLTPCIRNVTSPTRQHHV 

EREKDHSSSRPSSPRPQKASPNGSISSAGNSSRNS 

SQSSSDGSCKTAGEMVFVYENAKEGARNIRTSER 

VTLIVDNTRFVVDPSIFTAQPNTMLGRMFGSGRE 

HNFTRPNEKGEYEVAEGIGSTVFRAILDYYKTGII 

RCPDGISIPELREACDYLCISFEYSTIKCRDLSALM 

HELSNDGARRQFEFYLEEMJLPLMVASAQSGERE 



SEQ ID 
NO: 

Method 

Predicted 
beginning 
nucleotide 
location 
corresponding 
to first amino 
acid residue of 
peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





CHIVVLTDDDVVDWDEEYPPQMGEEYSQIIYSTK 

LYRFFKYIENRDVAKSVLKERGLKKIRLGIEGYP 

TYKEKVKKRPGGRPEVIYNYVQRPFIRMSWEKE 

EGKSRHVDFQCVKSKSITNLAAAAADIPQDQLV 

VMHPTPQVDELDILPIHPPSGNSDLDPDAQNPML 

3034 

A 

3 

1972 

SSLAQHRSVAVLGWPAGWAAARARPAMQGGN 

SGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEY 

DESDVPAEIQVLKEPLQQPTFPFAVANQLLLVSL 

LEHLSHVHEPNPLRSRQVFKLLCQTFIKMGLLSSF 

TCSDEFSSLRLHHNRA1THLMRSAKERVRQDPCE 

DISRIQKIRSREVALEAQTSRYLNEFEELAiLGKG 

GYGRVYKVRNKLDGQYYAIKKILIKGATKTVCM 

KVLREVKVLAGLQHPNIVGYHTAWIEHVHVIQP 

RADRAAIELPSLEVLSDQEEDREQCGVKNDESSS 

SSIIFAEPTPEKEKRFGESDTENQNNKSVKYTTNL 

VIRESGELESTLELQENGLAGLSASSIVEQQLPLR 

RNSHLEESFTSTEESSEENVNFLGQTEAQYHLML 

HIQMQLCELSLWDWIVERNKRGREYVDESACPY 

VMANVATK1FQELVEGVFY1HNMGIVHRDLKPR 

NGKRTPTHTSRVGTCLYASPEQLEGSEYDAKSD 

MYSLGVVLLELFQPFGTEMERAEVLTGLRTGQL 

PESLRKRCPVQAKYIQHLTRRNSSQRPSAIQLLQS 

ELFQNSGNVNLTLQMKIIEQEKEIAELKKQLNLL 

SQDKGVRDDGKDGGVG 

3035 

A 

110 

1172 

KLSCPCSHGTRVTAVRGPRLKAGVQWHDLGSLQ 

PPPSGLKQSSHLSLSSSWDFRHAPTHPETYTCPK 

MIEMEQAEAQLAELDLLASMFPGENELIVNDQL 

AVAELKDCIEKKTMEGRSSKVYFTDvMNLDVSD 

EKMAMFSLACILPFKYPAVLPEITVRSVLLSRSQQ 

TOT NTDT TA FT OKHCHGTWrTT NATFWVRFTTAS 

GYVSRDTSSSPTTGSTVQSVDLIFTRLWIYSHHIY 

NKCKRKNILEWAKELSLSGFSMPGKPGVVCVEG 

PQSACEEFWARLRKLNWKRILIRHREDIPFDGTN 

DETERQRKFSIFEEKVFSVNGARGNHMDFGQLY 

QFLNTKGCGDVFQMFLWV 

3036 

A 

1 

2288 

FRFAERRAAAAESDVSAKMAGRSMQAARCPTD 

ELSLTNCAVWEKDFQSGQIWIVRTSPNHRYTFT 

LKTHPSVVPGSIAFSLPQRKWAGLS1GQEIEVSLY 

TFDKAKQCIGTMTIEIDFLQKKSIDSNPYDTDKM 

AAEFIQQFNNQAFSVGQQLVFSFNEKLFGLLVKD 

IEAMDPSILNGEPATGKRQKIEVGLVVGNSQVAF 

EKAENSSLNLIGKAKTKENRQSIINPDWNFEKMG 

IGGLDKEFSDIFRRAFASRVFPPEIVEQMGCKHVK 

GILLYGPPGCGKTLLARQIGKMLNAREPKVVNG 

PEILNKYVGESEANIRKLFADAEEEQRRLGANSG 

LH1IIFDEIDAICKQRGSMAGSTGVHDTVVNQLLS 

KIDGVEQLNNILVIGMTNRPDLIDEALLRPGRLEV 

KMEIGLPDEKGRLQILHIHTARMRGHQLLSADV 

DIKELAVETKNFSGAELEGLVRAAQSTAMNRHI 

KASTKVEVDMEKAESLQVTRGDFLASLENDIKP 

AFGTNQEDYASYIMNGIIKWGDPVTRVLDDGEL 

LVQQTKNSDRTPLVSVLLEGPPHSGKTALAAKIA 

EESNFPFIKICSPDKMIGFSETAKCQAMKKIFDDA 

YKSQLSCVVVDDIERLLDYVPIGPRFSNLVLQAL 
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SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





LVLLKKAPPQGRKLL1IGTTSRKDVLQEMEMLNA 
FSTTIHVPNIATGEQLLEALELLGNFKDKERTTIA 
QQVKGKKVWIGIKKLLMLIEMSLQMDPEYRVRK 
FLALLREEGASPLDFD 

3037 

A 

1 

1347 

MLDTGSEHLNRILKALPALQSAGSEGQNGSAESL 

GEGGTRDSDRARRKLRGGNKEIPTFYPCLVVRSP 

VTASDLRGTQDFAAYHGLSLILEPLGACNRLSVC 

VPVHSPPGMRVSPRSPSLRTLVIDPAEPAGAQRL 

RFSGKERSGEAGSAVEGLAVAVSMGDGGAERD 

RGPARRAESGGGGGRCGDRSGAGDLRADGGGH 

SPTEVAGTSASSPAGSRESGADSDGQPGPGEADH 

CRRILVRDAKGTIREIVLPKGLDLDRPKRTRTFFT 

AEQLYRLEMEFQRCQYVVGRERTELARQLNLSE 

TQVKVWFQNRRTKQKKDQSRDLEKRASSSASEA 

FATSNILRLLEQGRLLSVPRAPSLLALTPSLPGLP 

ASHRGTSLGDPRNSSPRLNPLSSASASPPLPPPLP 

AVCFSSAPLLDLPAGYELGSSAFEPYSWLERKVG 

SASSCKKANT 

3038 

A 

924 

501 

TELLPLCSRSGPKPQSGDPLLQLAQQARPRLSGE 

RLETAPSLLLSRMACVISGWALSRGARTWTWAT 

PTGPVHRAQPAIRSLSAEGALTRLKEEKWPGRYI 

LPNHLTPPFLYKHLGSVPPSHWRSPHSHSVNILA 

LNWR 

3039 

A 

1263 

111 

ACGIRHEGALPGLTATPEAMLRFLPDLAFSFLL1L 

ALGQAVQFQEYVFLQFLGLDKAPSPQKFQPVPYI 

LKKIFQDREAAATTGVSRDLCYVKELGVRGNVL 

RFLPDQGFFLYPKKISQASSCLQKLLYF>JLSAIKE 

REQLTLAQLGLDLGPNSYYNLGPELELALFLVQE 

PHVWGQTTPKPGKMFVLRSVPWPQGAVHFNLL 

DVAKDWNDNPRKNFGLFLEILVKEDRDSGVNFO 

PEDTCARLRCSLHASLLVVTLNPDQCHPSRXRRA 

AIPWKLSCKNLCHRHQLFINFRDLGWHKWIIAP 

KGFMANYCHGECPFSLTISLNSSNYAFMQALMH 

AVDPEIPQAVCIPTKLSPISMLYQDNNDNVILRHY 

EDMVVDECGCG 

3040 

A 

15 

849 

ASRLPRGPGCGADMRPLLGLLLVFAGCTFALYL 

LSTRLPRGRRLGSTEEAGGRSLWFPSDLAELREL 

SEVLREYRKEHQAYVFLLFCGAYLYKQGFAIPGS 

SFLNVLAGALFGPWLGLLLCCVLTSVGATCCYL 

LSSIFGKQLVVSYFPDKVALLQRKVEENRNSLFF 

FLLFLRLFPMTPNWFLNLSAPILNIPIVQFFFSVLI 

GLIPYNFICVQTGSILSTLTSLDALFSWDTVFKLL 

AIAMVALIPGTLIKKFSQKHLQLNETSTANHIHSR 

KDT 

3041 

A 

1015 

175 

GLKRRRLCFAKVGDVLGCLSLPPSRSARVLEDISI 

LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 

KGLSGKRTKTENSGEALAKVEDSNPQKTSATKN 

CLKNLSSHWLMKSEPESRLEKGVDVKFSIEDLKA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKJVKEAYPDHTQFEKNNPHY 

DPSSKEDNPKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 

3042 

A 

1015 

175 

GLKRKRLCFAKVGDVLGCLSLPPSRSARVLEDISI 
LSCISVDSRIVRTKVPCSVTMSRPRKRLAGTSGSD 
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SEQ ro 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





KGLSGICRTKTENSGEALAKVEDSNPQKTSATKN 

T^-KTT OPTTIirr » rtr OT7TIT70 1> T EM/ - /*"» \1XW t\f X?Q\XZX~W \f A 

CLKNLSSH WLMKSbPbbKLbKAj V u V Rr MbULRA 

QPKQTTCWDGVRNYQARNFLRAMKLGEEAFFY 

HSNCKEPGIAGLMKIVKEAYPDHTQFEKNNPHY 

DPSSKEDWKWSMVDVQFVRMMKRFIPLAELKS 

YHQAHKATGGPLKNMVLFTRQRLSIQPLTQEEF 

DFVLSLEEKEPS 

3043 

A 

153 

1133 

VGTAPAPGGRDRAPAMGSFQLEDFAAGWIGGA 

ASVIVGHPLDTVKTRLQAGVGYGNTLSCIRVVY 

RRESMFGFFKGMSFPLASIAVYNSVVFGVFSNTQ 

RFLSQHRCGEPEASPPRTLSDLLLASMVAGVVSV 

GLGGPVDLIKIRLQmQ 1 QrrKJJANLGLKbKA V Ar 

AEQPAYQGPVHCITTIVRNEGLAGLYRGASAML 

LRDVPGYCLYFIPYVFLSEW1TPEACTGPSPCAV ! 

WLAGGMAGAISWGTATPMDVVKSRLQADGVY 

LNKYKGVLDCISQSYQKEGLKVFFRGITVNAVR 

GFPMSAAMFLGYELSLQAIRGDHAVTSP 

3044 

A 

41 

1316 

PPLGAGAGIHARSPHPARRLRLTAAGVGGRASG 

LLPTPWRRHHGPSGAAPYPAARLWQGPWRCRR 

PQPMAQRYDELPHYPGIADGPAALAGFPEAVPA 

APGPYGPHRPPQPLPPGLDSDGLKRDKDEIYGHP 

LFPLLALGFEKCELATCSPRDGAGAGLGTPRGGD 

VCSSDSFNEDNTAFAKQVCSERPFSSNPELDNLM 

IQAIQVLRFHLLELEKGKMPIDLVIEDRDGGCRE 

DFEDYPAPCPSLPDQNNIWIRDHEDSGSVHLGTP 

GPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGE 

DEDLDQEPRRNKKRGIFPKVATNIMRAWLFQHL 

SHPYPSEEQKKQLAQDTGLT1LQVNNWF[NARRR 

IVQPMIDQSNRTGQGAAFSPEGQPIGGYTETEPH 

VAFRAPASVGMSLNSEGEWHYL 

3045 

A 

3 

967 

VAHTQWHTCQRLSQLTHRSILKYLL1DTHACQV 

LILKHTHASLSLPSCQECFPSSIPSASHMVSHPHPP 

PSPRWGQTPEGLPAASPCGPGPRSCFSSILPTGDS 

WGMLACLCTVLWHLPAVPALNRTGDPGPGPSIQ 

KTYDLTRYLEHQLRSLAGTYLN YLGPPFN bPDr N 

PPRLGAETLPRATVDLEVWRSLNDKLRLTQNYE 

AYSHLLCYLRGLNRQAATAELRRSLAHFCTSLQ 

GLLGSIAGVMAALGYPLPQPLPGTEPTWTPGPAH 

SUbLQKMDDr W LbKbbl^ 1 W JL WKo AJvUr i\KJLivlv 

KMQPPAAAVTLHLGAHGF 

3046 

A 

1185 

1584 

MYAYMYICTfflCICAYRGIHlDVYLYMCIYIHIWI 
HTYLCVHIYVYVYICTHICMCIHTYVYVYTYMY 
VYTY1CLCVYICLCVHIYLCVYIHMYMCTHICMC 
IHTYVHMCICVYIHMYTCVYVYTYTCVY1V1Y 

3047 

A 

811 

132 

SLDLLGPIGILQEGRDPGTQGPQEKEKQMPASPM 

NTOAHLDIOTKEGLKKERS YTuQr bAN VKL/bbK 

QCGCGVVPDSLLMKVLSQRLDQQDCIQKGWVL 

HGVPRDLDQAFDLLNRLGYNPNREFFLNVPFDSI 

MERLTLRRIDPVTGERYHLMYKPPPTMEIQARLL 

OMPT^DAFFOVKT KMDLFYRNSADLEOLYGSAIT 

LNGDQDPYTVFEY1ESGIINPLPKKIP 

3048 

A 

2 

1166 

RPRRGQGLVQEVQTENVTVAEGGVAEITCRLHQ 
YDGSIWIQNPARQTLFFNGTRALKDERFQLEEFS 
PRRVRIRLSDARLEDEGGYFCQLYTEDTHHQIAT 
LTVLVAPENPVVEVREQAVEGGEVELSCLVPRSR 
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SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





PAATLRWYRDRKELKGVSSSQENGKVWSVAST 

VRFRVDRKDDGGI1ICEAQNQALPSGHSKQTQYV 

LDVQYSPTARIHASQAWREGDTLVLTCAVTGN 

PRPNQIRWNRGNESLPERAEAVGETLTLPGLVSA 

DNGTYTCEASNKHGHARALYVLVVYGESRLRPT 

EGGGGAPDPGAVVEAQTSVPYAIVGGILALLVFL 

HCVLVGMVWCSVRQKGSYLTHEASGLDEQGEA 

REAFLNGSDGHKRKEEFFI 

3049 

A 

3159 

882 

VGCTLRVGVMAAAGSRKRRLAELTVDEFLASGF 

DSESESESENSPQAETREAREAARSPDKPGGSPSA 

SRRKGRASEHKDQLSRLKDRDPEFYKFLQENDQ 

SLLNFSDSDSSEEEEGPFHSLPDVLEEASEEEDGA 

EEGEDGDRVPRGLKGKKNSVPVTVAMVERWKQ 

AAKQRLTPKLFHEVVQAFRAAVATTRGDQESAE 

ANKFQVTDSAAFNALVTFCIRDLIGCLQKLLFGK 

VAKDSSRMLQPSSSPLWGKLRVDIKAYLGSAIQL 

VSCLSETTVLAAVLRHISVLVPCFLTFPKQCRML 

LKRMVVVWSTGEESLRVLAFLVLSRVCRHKKDT 

FLGPVLKQMYITYVRNCKFTSPGALPFISFMQWT 

LTELLALEPGVAYQHAFLYIRQLAIHLRNAMTTR 

KKETYQSVYNWQYVHCLFLWCRVLSTAGPSEA 

LQPLVYPLAQVIIGCIKLIPTARFYPLRMHCIRALT 

LLSGSSGAFIPVLPFILEMFQQVDFNRKPGRMSSK 

PINFSVILKLSNVNLQEKAYRDGLVEQLYDLTLE 

YLHSQAHCIGFPELVLPVVLQLKSFLRECKVANY 

CRQVQQLLGKVQENSAYICSRRQRVSFGVSEQQ 

AVEAWEKLTREEGTPLTLYYSHWRKLRDREIQL 

E1SGKERLEDLNFPEIKRRKMADRKDEDRKQFKD 

LFDLNSSEEDDTEGFSERGILRPLSTRHGVEDDEE 

DEEEGEEDSSNSEDGDPDAEAGLAPGELQQLAQ 

GPEDELEDLQLSEDD 

3050 

A 

870 

182 

HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTM 

GCCGCSRGCGSGCGGCGSSCGGCGSGCGGCGSG 

RGGCGSGCGGCSSSCGGCGSRCYVPVCCCKPVC 

SWVPACSCTSCGSCGGSKGGCGSCGGSKGGCGS 

CGCSQSSCCKPCCCSSGCGSSCSQSSCCKPCCCSS 

GCGSSCCQSSCCKPYCCQSSCCKPCSCFSGCGSS 

CCQSSCYKPCCCQSSCCVPVCCQCKI 

3051 

A 

175 

4330 

NIPRWNFQGKSFGVVLVHFSSEEVDMASDSPARS 

LDEIDLSALRDPAG1FELVELVGNGTYGQVYKGR 

HVKTGQLAAIKVMDVTGDEEEEIKQEINMLKKY 

SHHRNIATYYGAFIKKNPPGMDDQLWLVMEFCG 

AGSVTDLIKNTKGYTLKEEWIAYICREILRGLSHL 

HQHKVIHRDIKGQNVLLTENAEVKLVDFGVSAQ 

LDRTVGRRNTFIGTPYWMAPEY1ACDENPDATY 

DFKSDLWSLGITAIEMAEGAPPLCDMHPMRALF 

LIPRNPAPRLKSKKWSKKFQSFIESCLVKNHSQRP 

ATEQLMKHPFIRDQPNERQVRJQLKDHIDRTKKK 

RGEKDETEYEYSGSEEEEEENDSGEPSSILNLPGE 

S I LRRDr LRl/QLANKfcKbbALRR(JQLfaQyQRbN 

EEHKRQLLAERQKRIEEQKEQRRRLEEQQRREKE 

LRKQQEREQRRHYEEQMRREEERRRAEHEQEYI 

RRQLEEEQRQLEILQQQLLHEQALLLEYKWCQLE 

EQRQAERLQRQLKQERDYLVSLQHQRQEQRPVE 

KKPLYHYKEGMSPSEKPAWAKEVEERSRLNRQS 
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SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





SPAMPHKVANRISDPNLPPRSESFSISGVQPARTP 

PMLRPVDPQIPHLVAVKSQGPALTASQSVHEQPT 

KGLSGFQEALNVTSHRVEMPRQNSDPTSENPPLP 

TRIEKFDRSSWLRQEEDIPPKVPQRTTSISPALAR 

KNSPGNGSALGPRLGSQPIRASNPDLRRTEPILES 

PLQRTSSGSSSSSSTPSSQPSSQGGSQPGSQAGSSE 

RTRVRANSKSEGSPVLPHEPAKVKPEESRDITRPS 

RPASYKJCAIDEDLTALAKELRELRJEETORPMKK 

VTDYSSSSEESESSEEEEEDGESETHDGTVAVSDI 

PRLIPTGAPGSNEQYNVGMVGTHGLETSHADSFS 

GSISREGTLMIRETSGEKKRSGHSDSNGFAGHINL 

PDLVQQSHSPAGTPTEGLGRVSTHSQEMDSGTE 

YGMGSSTKASFTPFVDPRVYQTSPTDEDEEDEES 

SAAALFTSELLRQEQAKLNEARKISVVNVNPTNI 

RPHSDTPEIRKYKKRFNSEILCAALWGVNLLVGT 

ENGLMLLDRSGQGKVYNLINRRRFQQMDVLEG 

LNVLVTISGKJGvfKLRVYYLSWLRNRILHNDPEV 

EKKQGW1TVGDLEGCIHYKVVKYERIKFLVIALK 

NAVFTYAWAPKPYHKFMAFK^FAni rH-TRTPT T Vn 
xyr\ v m>v aj rvr i nivr ivi/-vr Ivor r\ LJl^K^nJ\*i 1jL» V U 

LTVEEGQRLKVIFGSHTGFHVIDVDSGNSYDIYIP 
SHIQGNITPHAIVILPKTDGMEMLVCYEDEGVYV
NTYGRITKDVVLQWGEMPTSVAYIHSNQIMGW 
GEKAIEIRSVETGHLDGVFMHKRAQRLKFLCERN 
DKVFFASVRSGGSSQVFFMTLNRNSMMNW 
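
The one-letter codes in the column header can be written as a lookup table and used to flag characters in a transcribed peptide line that the legend does not cover, for example a digit that has crept in during scanning. The following is a minimal sketch only; the names `LEGEND` and `unexpected_symbols` are illustrative and do not appear in the specification.

```python
# One-letter codes and symbols exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def unexpected_symbols(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based position, character) pairs not covered by the legend.

    Whitespace is dropped first, so positions refer to the compacted sequence.
    """
    compact = "".join(sequence.split())
    return [(i, ch) for i, ch in enumerate(compact) if ch not in LEGEND]
```

An empty result means every character of the line is a symbol defined in the legend; it does not by itself show that the residues are correct.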

3052 

A 

1 

615 

MGQVECGGQKLGNQLEDDSEPAEGKVYSSDEE 
KT FASAGDPAOSROFFFnSfrfjn^FrYnfiFT n^<?A 

GGPGALLGPKPKLKGSLGTGAEEGAPVTAGVTA 
PGGKSRRRRTAFTSEQLLELEKEFHCKKYLSLTE 
RSQIAHALKLSEVQVKIWFQNRRAKWKRDCAGN 
VSSRSGEPVRNPKIVVPIPVHVNRFAVRSQHQQM 
EQGARP 

3053 

A 

203 

2167 

FGVRVPSNTQCLVPSFHCMQTSEWDSECLTSLQP 

LPLPTPPAANEAHLQTAAISLWTVVAAVQAIERK 

VEIHSRRLLHLEGRTGTAEKKLASCEKTVTELGN 

QLEGKGAVLGTLLQEYGLLQRRLENLENLLRNR 

NFWILRLPPGIKGDIPKVPVAFDDVSIYFSTPEWE 

KLEEWQKELYKNIMKGNYESLISMDYAINQPDV 

LSQIQPEGEHNTEDQAGPEESEEPTDPSEEPGISTS 

DILSWIKQEEEPQVGAPPESKESDVYKSTYADEE 

LVIKAEGLARSSLCPEVPVPFSSPPAAAKDAFSDV 

AFKSQQSTSMTPFGRPATDLPEASEGQVTFTQLG 

SYPLPPPVGEQVFSCHHCGKNLSQDMLLTHQCS 

HATEHPLPCAQCPKHFTPQADLSSTSQDHASETP 

PTCPHCARTFTHPSRLTYHLRVHNSTERPFPCPDC 

PKRFADQARLTSHRRAHASERPFRCAQCGRSFSL 

KJSLLLHORGHAOERPFSCPOCGIDFNGHSAT IRH 

QMIHTGERPYPCTDCSKSFMRKEHLLNHRRLHT 

GERPFSCPHCGKSFIRKFMLMKHQRIHTGERPYP 

CSYCGRSFRYKQTLKDF1LRSGHNGGCGGDSDPS 

GQPPNPPGPLITGLETSGLGVNTEGLETNQWYGE 

GSGGGVL 

3054 

A 

3 

2212 

SCGHKSAYGSYTGLQLFWEDGQELLQHQQLQD 
LRLCVHLRPQSEKVELSLWTLFWGKGEPSAVR 
EKLGKAGFAAASGPGGRPGAERASTVLNILHLT 
AESRWEPNACNRVSSSPAGVGPLDLPVGPLLYFF 

APWARASFLCHAFQRPLTGIGLNTVRFTSEFPLH 

SKDPTAHKLLFTGNYLCKLHPRPRHAPQGSLSDF 

CHGTEGKDLPSEHNVSVEGVAQDRSPEATLCPQ 

KTCPCDICGLRLKDILHLAEHQTTHPRQKPFVCE 

AYVKGSEFSANLPRKQVQQNVHNPIRTEEGQAS 

PVKTCRDHTSDQLSTCREGGKDFVATAGFLQCE 

VTPSDGEPHEATEGVVDFHIALRHNKCCESGDAF 

NNKSTLVQHQRIHSRERPYECSKCGIFFTYAADL 

TQHQKVHNRGKPYECCECGKFFSQHSSLVKHRR 

VHTGESPHVCGDCGKFFSRSSNLIQHKRVHTGEK 

PYECSDCGBCFFSQRSNLIHHKRVHTGRSAHECSE 

CGKSFNCNSSLIKHWRVHTGERPYKCNECGKFFS 

HQRVHTGERPYECNECGKLFSQSSSLNSHRRLHT 

GERPYQCSECGKJTOQSSSLIWEURRLHTGERPYE 

CSECGKTFRQRSNLRQHLKVHKPDRPYECSECG 

KAFNQRPTLIRHQKmiRERSMENVLLPCSQHTPE 

ISSENRPYQGAVNYKLKLVHPSTHPGEVP 

3055 

A 

268 

2954 

ARRSSSSQGSAAPTPCQWEASRDQLVAGPSGK 

MGiNfREMEELIPLVNRLQDAFSALGQSCLLELPQI 

AVVGGQSAGKSSVLENFVGRDFLPRGSGIVTRRP 

LVLQLVTSKAEYAEFLHCKGKKFTDFDEVRLEIE 

AETDRVTGMNKGISSIPINLRVYSPHVLNLTLIDL 

PGITKVPVGDQPPDIEYQIRMIMQFiTRENCLILA 

VTPANTDLANSDALKLAKEVDPQGLRTIGVITKL 

DLMDEGTDARDVLENKLLPLRRGYVGVVNRSQ 

KDIDGKKD1XAAMLAERKFFLSHPAYRHIADRM 

GTPHLQKVLNQQLTNHIRDTLPNFRNKLQGQLLS 

ffiHEVEAYI<Q^FKPEDPTRKTKALLQMVQQFAVD 

FEKRIEGSGDQVDTLELSGGAKINR1FHERFPFEIV 

KME1^EK£LRJR£ISYA1XNIHGIRTGLFTPDMAFE 

AIVKXQIVKLKGPSLKSVDLVIQELIT^TVKKCTK 

KLANFPRLCEETERIVANHIREREGKTKDQVLLLI 

DIQVSYINTNHEDFIGFANAQQRSSQVHKKTTVG 

NQVIRKGWLTISNIGIMKGGSKGYWFVLTAESLS 

WYKDDEEKEKKYMLPLDNLKVRDVEKSFMSSK 

HDFALFNTEQRNVYKDYRFLELACDSQEDVDSW 

KASLLRAGVYPDKSVGNNKAENDENGQAENFS 

IV11)PQLERQVETIRI>JLVDSYMSIINKCIRDLIPKTI 

]VlHLMIiSnSrVKI)FiNSFLT AOI YSSFDONTT MFF<? 

AEQAQRRDEMLRMYQALKEALGIIGDIGTATVS 

TPAPPPVDDSWIOHSl^SPPPSPfTORRPTLSAPL 

ARPTSGRGPAPAIPSPGPHSGAPPVPFRPGPLPPFP 

SSSDSFGAPPOWSRFTIIAPPSVPSRRPPPSPTRPTI 

IRPLESSLLD 

3056 

A 

1674 

1839 

VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 

3057 

A 

1674 

1839 

VVRVTCCPPARSTTERTNAYDEEDCVEMVASGG 
WNDVACHTTMYFMCEFDKKNM 
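
Each table row carries five fields in a fixed order: SEQ ID NO, method, predicted beginning nucleotide location, predicted end nucleotide location, and the peptide sequence, which may run over several lines. A minimal sketch of reading one row into a record follows. The names `PeptideEntry` and `parse_entry` are hypothetical, and counting the span inclusively, in either direction, is an assumption of this sketch rather than a convention stated in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideEntry:
    seq_id_no: int   # "SEQ ID NO:" column
    method: str      # "Method" column
    begin_nt: int    # predicted beginning nucleotide location
    end_nt: int      # predicted end nucleotide location
    peptide: str     # amino acid sequence, one-letter code

    def nucleotide_span(self) -> int:
        """Positions from begin to end, counted inclusively (an assumption).

        abs() is used because some rows list a begin value larger than the end.
        """
        return abs(self.end_nt - self.begin_nt) + 1


def parse_entry(lines: list[str]) -> PeptideEntry:
    """Build an entry from the non-blank lines of one table row."""
    fields = [ln.strip() for ln in lines if ln.strip()]
    seq_id, method, begin, end, *chunks = fields
    # Join the wrapped sequence lines and drop any stray internal spaces.
    peptide = "".join("".join(chunk.split()) for chunk in chunks)
    return PeptideEntry(int(seq_id), method, int(begin), int(end), peptide)
```

Applied to the lines of row 3056 above, this yields a 54-residue peptide between locations 1674 and 1839.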

3058 

A 

3363 

2525 

FLVKLILIILCRCLHSLSRSVQQLRTSFQDHAVWK 

PLMKVLQNAPDEELWASSMLCNLLLEFSPSKEPI 

LESGAVELLCGLTQSENPALRVNGIWALMNMAF 

QAEQKIKADILRSLSTEQLFRLLSDSDLNVLMKT 

LGLLRNLLSTRPHIDKIMSTHGKQIMQAVTLrLEG 

E1WIEVKEQTLCILANIADGTTAKDLIMTNDDILQ 

KIKYYMGHSHVKLQLAAMFCISNLIWNEEEGSQ 
ERQDKXRDMGIVDILHKLSQSPDSNLCDKAKMA 
I OOYLA 

3059 

A 

679 

167 

SSWPSLSSQMHFPSFHLHVAAHYGRDSFVRLLLE 
FKAEVDPLSDKGTTPLQLAIIRERSSCVKILLDHN
ANIDIONGFLLRYAVIKSNHSYCRMFT ORGADTN 

> ^ VJ X. I,,«J./JL\. X i \ V 1 IV O I > JL JL vj k \sk.\±YXX A-*\^1\.VJ IlJLS X 1^1 

LGRLEDGQTPLHLSALRDDVLCARMLYNYGAD 
TNTRNYEGQTPLAVSISISGSSRPCLDFLQEVTSM 

3060 

A 

30 

234 

ppt ot DMOPNCYCAnnn^rTrAG^rKCKFrKPT 

SCKKSCCSCCPAGCAKCAQGCICKGATDKCSCC 

A 

3061 

A 

428 

720 

VRRDVRQQATWAMASDLDFSPPEVPEPTFLENL 

LRYGLFLGAIFQLICVLAIIVPIPKSHEAEAEPSEPR 

SAEVTRKPKAAWSVNKlvPKKETKKKR 

3062 

A 

1589 

276 

WKQKYEPLGLDAAGIEEAITAVGSFILKANELLQ 

VIDSSMKNFKAFFRWLYVAMLRMTEDHVLPELN 

KMTQKDITFVAEFLTEHFNEAPDLYNRKGKYFN 

VERVGQYLKDEDDDLVSPPNTEGNQWYDFLQN 

SSHLKESPLLFPYYPRKSLHFVKRRMENIIDQCLQ 

KPAD VIG KSMNQ A ICrPL YRDTRS EDSTRRLFKFP 

FLWNNKTSNLHYLLFTILEDSLYKMCILRRHTDIS 

V^o VolNOJLlAllsJrUolM iAi 1 JclvVlvKol Y oLLUA^r 

YDDETVTVVLKDTVGREGRDRLLVQLPLSLVYN 
SEDSAEYQFTGTYSTRLDEQCSAIPTRTMHFEKH 
WRLLESMKAQYVAGNGFRKVSCVLSSNLRHVR 
VFEMDIDDEWELDESSDEEEEASNKPVKIKEEVL 
SESEAENQQAGAAALAPEIVIKVEKLDPELDS 

3063 

A 

50 

849 

DKMPSIFAYQSSEVDWCESNFQYSELVAEFYNTF 

SN1PFFIFGPLMMLLMHPYAQKRSRYIYVVWVLF 

ivijj.vxL»r oivi i rrnvi i j^ori^Ov^JLi^iJij<i/\JLL. w l.JLvj ovj i o 

IWMPRCYFPSFLGGNRSQFIRLVF1TTVVSTLLSFL 

RPTVNAYALNSIALHILYIVCQEYRKTSNKELRH 

LIEVSVVLWAVALTSWISDRLLCSFWQRIHFFYL 

HSIWHVLIS1TFPYGMVTMALVDANYEMPGETL 

KVRYWPRDSWPVGLPYVEIRGDDKDC 

3064 

A 

X J/ — J 


DDGFGMDPFPDDLTASWPDWALPRLSSAWPGTL 

RSGMVPRGPTATARFGVPAEGRTPPPFPGEPWK 

VCVNVHSFKPEELMVKTKDGYYEVSGKJ^ 

QEGGIVS1<MTKX1QLPAEVDPVTVFASLSPEGLL 

11EAPQVPPYSTFGESSFNNELPQDSQEVTCT 

3065 

A 

230 

2929 

LSTSLTGSHLFSLG^mSTRENLNAGNFWPSEGH 

LVRSTGPGGSFAKHMVAQCVSPKGPLACSRTYF

FGATHVPYLGGDSKLPKKTEQIRLLSQIYAAVIE 

AVLAGIACYAKTSSLTKAKEVAEQTLGSGLDSFE 

L1TF1<j^LRSKMTFHIHAVNNQGRIVPLDSEDSLS 

FVKTACMAVYDPDLLGGNGCLGSVVFSESFLTS 

Q1LVKEKDGTVTTETSSVVLTAAVPRFCSWLVED 

NEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLY 

SSiNrLOSWPEEGNVHFFSSGLLFSHCRHGSlIISKD 

HIVINSISFYDGDSTSTVAALLIDFKSSLLPI^PVHF 

HGSSl^MIALFPKSKIYQAFYSEVFSLWKQQDN 

SGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPA 

GEKJRSSLKLLSAl<LPELDWFLQrIFAISSISQEPVM 

RT^PVLLQQAErNTTEiRIESDKVIISIVTGLPGCH 

ASELCAFLVTLHKECGRWMVYRQIMDSSECFHA 

AHFQRYLSSALEAQQNRSARQSAYIRKKTRLLV 

VLQGYTDVIDVVQALQTHPDSNVKASFTIGAITA 

CVEPMSCYMEHRFLFPKCLDQCSQGLVSNVVFT 

SHTTEQRHPLLVQLQSLIRAANPAAAFILAENGIV 

TRNEDIELILSENSFSSPEMLRSRYLMYPGWYEG 

KLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKA 

TOQQTi^PQPrrQnxTiVTJii nv\TWQr\^vn} r r\/rc\//^\rKi r r 
lls^oolJPkJrorroljrNl YHlLOJvVlvroL>ot,K 1 IVlbVL, YN 1 

LANSLSIMPVLEGPTPPPDSKSVSQDSSGQQECYL 
VF1GCSLKEDSIKDWLRQSAKQKPQRKALKTRG 
MLTQQEIRSIHVKRHLEPLPAGYFYNGTQFVNFF 
GDKTDFHPLMDQFMNDYVEEANREIEKYNQELE 

I HJJJLrxiL,Jvr 

3066 

A 

130 

588 

LAPLRCQPGTRTQPRSHPAANDPSAAMSAAGAR 

GLRATYHRLLDKVELMLPEICLRPLYNHPAGPRT 

VFFWAPIMKWGLVCAGLADMARPAEKLSTAQS 

AVLMATGFIWSRYSLVIIPKNWSLFAVNFFVGAA 

GASQLFR1WRYNQELKAKAHK 

3067 

A 

2 

1016 

EFARRRVFIAAREMSLLRSLRVFLVARTGSYPAG 

SLLRQSPQPRHTFYAGPRLSASASSKELLMKLRR 

KTGYSFVNCKKALETCGGDLKQAEIWLHKEAQ 

KEGWSKAAKLQGRKTKEGLIGLLQEGNTTVLVE 

VNCETDFVSRNLKFQLLVQQVALGTMMHCQTL 

KDQPSAYSKGFLNSSELSGLPAGPDREGSLKDQL 

ALAIGKLGENMILKRAAWVKVPSGFYVGSYVHG 

AMQSPSLHKLVLGKYGALVICETSEQKTNLEDV 

GRRLGQHWGMAPLSVGSLDDEPGGEAETKML 

SQPYLLDPSITLGQYVQPQGVSVVDFVRFECGEG 

EEAAETE 

3068 

A 

3 

1679 

NSRVWGPWTEPSAGSLRPMARKQNRNSKELGL 

VPLTDDTSHAGPPGPGRALLECDHLRSGVPGGR 

RRKDWSCSLLVASLAGAFGSSFLYGYNLSVVNA 

PTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 

SIFAIGGLVGTLIVKMIGKVLGRKHTLLANNGFAI 

SAALLMACSLQAGAFEMLIVGRFIMGIDGGVALS 

VLPMYLSEISPKEIRGSLGQVTAIFICIGVFTGQLL 

GLPELLGKESTWPYLFGVIWPAWQLLSLPFLP 

DSPRYLLLEKHNEARAVKAFQTFLGKADVSQEV 

EEVLAESRVQRSIRLVSVLELLRAPYVRWQVVT 

VIVTMACYQLCGLNAIWFYTNSIFGKAGIPPAKIP 

LMGLFFGTLTITLTLQDHAPWVPYLSIVGILAIIAS 

FCSGPGGlPFILTGEFFQQSQRPAAFnAGTVNWLS 

NFAVGLLFPFIQKSLDTYCFLVFATICITGAIYLYF 

VLPETKNRTYAEISQAFSKRNKAYPPEEK1DSAV 

TDGKINGRP 


3069

A

OOJ 

"2 Art 

AAuA V V bAMPKAKOK 1 KKv£tvr Cj YbVNRKKLNK 

NARRKAAPRIECSHIRHAWDHAKSVRQNLAEMG 

LAVDPNRAVPLRKRKVKAMEVDIEERPKELVRK 

PYVLNDLEAFAST PFKKGNTLSRDI 1DYVRYMV 

ENHGEDYKAIVIAI\DEKNYYQ 

KRFYPAEWQDFLDSLQKRKMEVE 

3070 

A 

325 

2019 

LAEPEVATDSGQQADLPAEGGDPRAEASCSVLH 
SKPRAMADSRDPASDQMQ1TVVKEQRAAQKADV 
LTTGAGNPVGDKiNVITVGPRGPLLVQDVVFTD 

EMAHFDRERIPERVVHAKGAGAFGYFEVTHDIT 

KYSKAKVFEHIGKKTPIAVRFSTVAGESGSADTV 

RDPRGFAVKFYTEDGNWDLVGNNTPIFFIRDPILF 

PSFIHSQKRNPQTHLKDPDMVWDFWSLRPESLH 

QVSrXFSDRGIPDGHRHMNGYGSHTFKLVNANG 

EAVYCKFHYKTDQGIKNLSVEDAARLSQEDPDY 

GIRDLFNAIATGKYPSWTFYIQVMTFNQAETFPF 

NPFDLTKVWPHKDYPLIPVGKLVLNRNPVNYFA 

EVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDT 

HRHRLGPN YLH1P VNCP Y RAK V AN Y QKJJUPMC 

MQDNQGGAPNYYPNSFGAPEQQPSALEHSIQYS 

GEVRRFNTANDDNVTQVRAFYVNVLNEEQRKR 

LCENIAGHLKDAQIFIQKKAVKNFTEVHPDYGSH 

IQALLDKYNAEKPKNAIHTFVQSGSHLAAREKA 

NL 

3071 

A 

1 

1187 

SLGWLERPPALSRAAGDGARRLSGSRRGDVWLT 

SSAAGLLRSVAGGSWCGGQLRARGGSGRCVAR 

AMTGNAGEWCLMESDPGVFTELIKGFGCRGAQ 

VEEIWSLEPENFEKLKPVHGLIFLFKWQPGEEPA 

GSVVQDSRLDTIFFAKQVINNACATQA1VSVLLN 

CTHQDVHLGETLSEFKEFSQSFDAAMKGLALSN 

SDVIRQVrlNSFARQQMFEFDTKTSAKEEDAFHF 

VSYVPVNGRLYELDGLREGPIDLGACNQDDWIS 

AVRPVIEKlUQKYSEGEIRl^LMAIVSDRKMrYEQ 

KIAELQRQLAEEEPMDTDQGNSMLSAIQSEVAK 

NQNILreEEVQKLKRYKIENIRRKHNYLPFIMELL 

KTLAEHQQL1PLVEKAKEKQNAKKAQETK 

3072 

A 

103 

2775 

RLRTLAPPGLLLGPPLVPDSRRRHQASLTPLHISG 

SPQLVGRGDRKLRTEVLVPPAALPAETRQRRSER 

LPRRTCPRGGAPGPGRSRLPRSLPPPSAIPGLRSPV 

WAAGLGGGGRilEPSRGKGGAALRARHRSTMAE 

LGAGGDGHRGGDGAVRSETAPDSYKVQDKKNA 

SSRPASAISGQNNNHSGNKPDPPPVLRVDDRQRL 

ARERREEREKQLAAREIVWLEREERARQHYEKH 

LEERKKl^EEQRQKEERRRAAVEEKRRQRLEED 

KERHEAVVRRTMERSQKPKQKHNRWSWGGSLH 

GSPSIHSADPDRRSVSTMNLSKYVDPVISKRLSSS 

SATLLNSPDRARRLQLSPWESSVVNRLLTPTHSF 

LARSKSTAALSGEAVIPICPRSASCSPDMPYKAAH 

S1WSMDRPKLFVTPPEGSSRRRIIHGTASYKKJERE 

RENVLFLTSGTRRAVSPSNPKARQPARSRLWLPS 

KSLPHLPGTPRPTSSLPPGSVKAAPAQVRPPSPGN 

HIPVKREVKVEPEKKDPEKEPQKVANEPSLKGRA 

PLVKVEEATVEERTPAEPEVGPAAPAMAPAPAS 

APAPASAPAPAPVPTPAMVSAPSSTVNASASVKT 

SAGTTDPEEATRLLAEKRRLAREQREKEERERRE 

QEELERQKREELAQRVAEERTTRREEESRRLEAE 

QAREKEEQLQRQAEERALREWEEAERAQRQKEE 

EARXHiUiEAERVRQEl^KHFQREEQEI^ERK^ 

TTETA/n^PTRPTE ATFlk r l< r TCr>OP"NmrJTAK'riAI Tdd 
JCrEliVllVlx 1 IVIV 1 £/\ 1 JJISJn. 1 C!)1-/V^1\JN \JUIJ\I>l.\J/\1j I OvJ 

TEVSALPCTTNAPGNGKPVGSPHVVTSHQSKVT 
VESTPDLEKQPNENGVSVQNEOTEEIINLPIGSKP 
SRLDVTNSESPEIPLNPILAFDDEGTLGPLPQVDG 
VQTQQTAEV1 

3073 

A 

67 

2415 

PPRVCl^HVCLICWDPIAGTGGSRSTMPALPLDQ 

LQITHKDPKTGKLRTSPALHPEQKADRYFVLYKP 

PPKDNIPALVEEYLERATFVANDLDWLLALPHD 

KFWCQVIFDETLQKCLDSYLRYVPRKFDEGVAS 

APEVVDMQKRLHRSVFLTFLRMSTHKESKDHFIS 

PSAFGEILYNNFLFDIPKILDLCVLFGKGNSPLLQ 

KMIGNIFTQQPSYYSDLDETLPTILQVFSNILQHC 

GLQGDGANTTPQKLEERGRLTPSDMPLLELKDIV 

LYLCDTCTTLWAFLDIFPLACQTFQKHDFCYRLA 

SFYEAAIPEMESAIKKRRLEDSKLLGDLWQRLSH 

SRKKLMEIFHIILNQICLLPILESSCDNIQGFIEEFL 

QIFSSLLQEKRFLRDYDALFPVAEDISLLQQASSV 

LDETRTAYILQAVESAWEGVDRRKATDAKDPSV 

IEEPNGEPNGVTVTAEAVSQASSHPENSEEEECM 

GAAAAVGPAMCGVELDSLISQVKDLLPDLGEGFI 

LACLEYYHYDPEQVINNILEERLAPTLSQLDRNL 

DREMKPDPTPLLTSRHNVFQNDEFDVFSRDSVDL 

SRVHKGKSTRKEENTRSLLNDKRAVAAQRQRYE 

QYSVVVEEVPLQPGESLPYHSVYYEDEYDDTYD 

GNQVGANDADSDDELISRRPFTIPQVLRTKVPRE 

GQEEDDDDEEDDADEEAPKPDHFVQDPAVLREK

AEARRMAFLAKKGYRHDSSTAVAGSPRGHGQS 

RETTQERRKKEANKATRANHNRRTMADRKRSK 

GMIPS 

3074 

A 

3 

251 

GEARSPPPAAALLDMDPETCPCPSGGSCTCADSC 
KCEGCKCTSCKKSCCSCCPAECEKCAKDCVCKG 
GEAAEAEAEKCSCCQ 

3075 

A 

255 

982 

SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK 

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQDCPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 

3076 

A 

255 

982 

SQFSLSQVLVDSAEEGSLAAAAELAAQKREQRL 

RKFRELHLMRNEARKLNHQEVVEEDKRLKLPAN 

WEAKKARLEWELKEEEKKKECAARGEDYEKVK

LLEISAEDAERWERKKKRKNPDLGFSDYAAAQL 

RQYHRLTKQIKPDMETYERLREKHGEEFFPTSNS 

LLHGTHVPSTEEIDRMVIDLEKQIEKRDKYSRRR 

PYNDDADIDYINERNAKFNKKAERFYGKYTAEI 

KQNLERGTAV 

3077 

A 

1 

968 

FRLRPRRACAQLLWHPAAGMASWAKGRSYLAP 

GLLQGQVAIVTGGATGIGKAIVKELLELGSNVVI 

ASRKLERLKSAADELQANLPPTKQARV1PIQCN1R 

NEEEX^^VKSTLDTFGKINFLVNNGGGQFLSPA 

EHISSKGWHAVLETNLTGTFYMCKAVYSSWMK 

KHGGSIV7OTVPTKAGFPLAVHSGAARAGVYNLT 

KSLAFEWACSGIRINCVAPGVIYSQTAVENYGSW 

GQSFFEGSFQKIPAKRIGVPEEVSSVVCFLLSPAA

SFITGQSVDVDGGRSLYTHSYEVPDHDNWPKGA 

GDLSVVKKMKETFKEKAKL 

3078 

A 

2 

3508 

FVRESGKAPVTFDDITVYLLQEEWVLLSQQQKEL 

CGSNKLVAPLGPTVANPELFRKFGRGPEPWLGS 

VQGQRSLLEHHPGKKQMGYMGEMEVQGPTRES 

GQSLPPQKJKAYLSHLSTGSGrflEGDWAGRNRKL 

LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREY 

PSIRDKRSRLIEGYTGPFKVETLKYHAKSKAHMF 

CVNALAARDPIWAARFRSIRDPPGDVLASPEPLF 

TADCPIFYPPGPLGGFDSMAELLPSSRAELEDPGG 

DGAIPAMYLDCISDLRQKEITDGIHSSSDIN1LYN 

DAVESCIQDPSAEGLSEEVPVVFEELPVVFEDVA 

VYFTREEWGMLDKRQKELYRDVMRMNYELLAS 

LGPAAAKPDL1SKLERRAAPWIKDPNGPKWGKG 

RPPGNKKMVAVREADTQASAADSALLPGSPVEA 

RASCCSSSICEEGDGPRRIKRTYRPRSIQRSWFGQ 

FPWLVIDPKETKLFCSACIERPNLHDKSSRLVRG 

YTGPFKVETLKYHEVSKAHRLCVNTVEIKEDTPH 

TALVPEISSDLMANMEHFFNAAYSIAYHSRPLND 

FEKJLQLLQSTGTVILGKYRNRTACTQFIKYISETL 

KREILEDVRNSPCVSVLLDSSTDASEQACVGIYIR 

YFKQMEVKESYITLAPLYSETADGYFETIVSALD 

ELDIPFRKPGWVVGLGTDGSAMLSCRGGLVEKF 

QEVIPQLLPVHCVAHRLHLAVVDACGSIDLVKK 

CDRHIRTVFKFYQSSNKRLNELQEGAAPLEQEIIR 

LKDLNAVRWVASRRRTLHALLVSWPALARHLQ 

RVAEAGGQIGHRAKGMLKLMRGFHFVKFCHFL 

LDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVAL 

ESLRHQAGPKEEEFNASFKDGRLHGICLDKLEVA 

EQRFQADRERTVLTGIEYLQQRFDADRPPQLKN 

MEVFDTMAWPSGIELASFGNDDILNLARYFECSL 

r I U I o£Zi£j^\.JL>JUE/IZ> W JL/VJL#rs>. 1 1/A.V^riJ_/r r 0 1 Yl 1^ v^IVlN /A. JLj /A. 

QHCRFPLLSKLMAVVVCVPISTSCCERGFKAMN 

RIRTDERTKLSNEVLNMLMMTAVNGVAVTEYD 

PQPAIQHWYLTSSGRRFSHVYTCAQVPARSPASA 

RLRKEEMGALYVEEPRTQKPPILPSREAAEVLKB 

CIMEPPERLLYPHTSQEAPGMS 

3079 

A 

343 

1513 

FSPLEPRLCSLGGWGALQAGEPCQPSRAGCGRE 

GATMGCTLSAEERAALERSKAIEKNLKEDGISAA 

KDVKJLLLLGAGESGKSTIVKQMKIIHEDGFSGED 

VKQYKPVVYSNTIQSLAAIVRAMDTLGIEYGDK 

ERKADAKMVCDVVSRMEDTEPFSAELLSAMMR 

LWGDSGIQECFNRSREYQLNDSAKYYLDSLDRIG 

A A DYOPTEODILRTRVKTTGIVETHFTFKNLHFR 

AAL/ A A A 1 /I 1 Jl\ A AV V IV A A VJ1 » M—i A AAA A A iulJUl A A A V 

LFDVGGQRSERKKWIHCFEDVTAIIFCVALSGYD 

QVLHEDETTNRMHESLKLFD^ 

LFLNKJKDEFEEKIKKSPLTICFPEYTGPSAFTEAVA 

YIQAQYESaO^KSAHKEIYSHVTCATDTNNIQFVF 

DAVTDVIIAKNLRGCGLY 

3080 

A 

41 

997 

EARTARELTDGVTDGLTN1ADQPKPISPLKNLLA 
GGFGGVCLVFVGHPLDTVKVRLQTQPPSLPGQPP 
MYSGTFDCFRKTLFREGITGLYRGMAAPIIGVTP 
MFAVCFFGFGLGK10.QQKHPEDVLSYPQLFAAG 

IVl_i_/0 vJ Vf 11 VJ11V1 a a vJa-iAVIJVV^JOJ^V^aV^^OOVJAjOJV a 1 vl 1 

LDCAKKLYQEFGlllGlYKGTXn^TLMRDVPASGM 
YFMTYEWLKNIFTPEGKRVSELSAPRILVAGGIA 
GIFNWAVAIPPDVLKSRFQTAPPGKYPNGFRDVL 
RELIRDEGVTSLYKGFNAVMIRAFPANAACFLGF 
EVAMKFLNWATPNL 

3081 

A 

3 

1996 

IMADMEDLFGSDADSEAERKDSDSGSDSDSDQE 

NAASGSNASGSESDQDERGDSGQPSNKELFGDD 

SEDEGASHHSGSDNHSERSDNRSEASERSDHEDN 

DPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSEA 

EGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDE 

ERAQGSDEDKLQNSDDDEKMQNTDDEERPQLS 

DDERQQLSEEEKANSDDERPVASDNDDEKQNSD 

DEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSESARGSDSEDEVLRMKRKNAIASDSE 

ADSDTEVPKDNSGTMDLFGGADD1SSGSDGEDK 

PPTPGQPVDENGLPQDQQEEEPIPETR1EVEIPKV 

NTDLGNDLYFVKLPNFLSVEPRPFDPQYYEDEFE 

DEEMLDEEGRTRLKLKVENTIRWR1RRDEEGNEI 

KESNAR1VKWSDGSMSLHLGNEVFDYYKAPLQG 

DHNHLFIROGTGLOGOAVFKTKLTFRPHSTnSAT 

HRKMTLSLADRCSKTQKIRILPMAGRDPECQRTE 

MIKKEEERLRASIRRESQQRRMREKQHQRGLSAS 

YLEPDRYDEEEEGEESISLAAIKNRYKGGIREERA 

RIYSSDSDEGSEEDKAQRLLKAKKLTSDEVRPNL 

FNSRGLSCTQEPTALNEELTDQAGTN 

3082 

A 

3 

921 

VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 
KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 
GASVILRDIARAREN1QKSLAGSSGPGASSGTSGD 
HGELVVRIASLEVENQSLRGVVQELQQAISKLEA 
RI NVI FKSSPCiHR ATAPOTOHV^PK/rRfWPPPAlf 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKKPALVAKSSILLDVKPWDDETD 

MAQLEACVRS1QLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 

3083 

A 



3 

921 

VEFCLPASADSSSLVAASLAGVRKMATNFLAHE 

KIWFDKFKYDDAERRFYEQMNGPVAGASRQEN 

GASVILRDIARARENIQKSLAGSSGPGASSGTSGD 

HGELVVRIASLEVENQSLRGVVQELQQAISKLEA 

RLNVLEKSSPGHRATAPOTOHVSPMROVFPPAK 

KPATPAEDDEDDDIDLFGSDNEEEDKEAAQLREE 

RLRQYAEKKAKiCPALVAKSSILLDVKPWDDETD 

MAQLEACVRSIQLDGLVWGASKLVPVGYGIRKL 

QIQCVVEDDKVGTDLLEEEITKFEEHVQSVDIAA 

FNKI 

3084 

A 

128 

4050 

KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNK1PS 

DLLDKEFLPILQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTWGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRXDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRJRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFR? 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLELPPPVPNPSPTLRP 

VETPWGAPGMGSVSTEPPDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRISDQNIIPSVTRSVSVPDTGSIWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

EEERRKQEELLRKQEEEAAKWAREEEEAQRRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRARNNTHSNLHTSIGNSVWGSINTGPPNQWA 

SDLVSSIWSNADTKNSNMGFWDDAVKEVGPKN 

STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 

LFQG VNKAQDGFTQ WCbQMLriAJLJN 1 AN N Lu V r 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 

3085 

A 

128 

4050 

KSIVKIRKRMAAETQTLNFGPEWLRALSSGGSITS 

PPLSPALPKYKLADYRYGREEMLALFLKDNKIPS 

DLLDKEFLP1LQEEPLPPLALVPFTEEEQRNFSMS 

VNSAAVLRLTGRGGGGTVVGAPRGRSSSRGRGR 

GRGECGFYQRSFDEVEGVFGRGGGREMHRSQS 

WEERGDRRFEKPGRKDVGRPNFEEGGPTSVGRK 

HEFIRSESENWRIFREEQNGEDEDGGWRLAGSRR 

DGERWRPHSPDGPRSAGWREHMERRRRFEFDFR 

DRDDERGYRRVRSGSGSIDDDRDSLPEWCLEDA 

EEEMGTFDSSGAFLSLKKVQKEPIPEEQEMDFRP 

VDEGEECSDSEGSHNEEAKEPDKTNKKEGEKTD 

RVGVEASEETPQTSSSSARPGTPSDHQSQEASQFE 

RKDEPKTEQTEKAEEETRMENSLPAKVPSRGDE 

MVADVQQPLSQIPSDTASPLLILPPPVPNPSPTLRP 

VETPVVGAPGMGSVSTEPDDEEGLKHLEQQAEK 

MVAYLQDSALDDERLASKLQEHRAKGVSIPLMH 

EAMQKWYYKDPQGEIQGPFNNQEMAEWFQAG 

YFTMSLLVKRACDESFQPLGDIMKMWGRVPFSP 

GPAPPPHMGELDQERLTRQQELTALYQMQHLQY 

QQFLIQQQYAQVLAQQQKAALSSQQQQQLALLL 

QQFQTLKMRlSDQNIIPSVTRSVSVPDTGSrWELQ 

PTASQPTVWEGGSVWDLPLDTTTPGPALEQLQQ 

LEKAKAAKLEQERREAEMRAKREEEERKRQEEL 

RRRQKGILRRQQEEERKRREEEELARRKQEEALR 

RQREQEIALRRQREEEERQQQEEALRRLEERRRE 

FFFRRKOFFT T RKOFFFAAKWAREEEEAORRLE 

ENRLRMEEEAARLRHEEEERKRKELEVQRQKEL 

MRQRQQQQEALRRLQQQQQQQQLAQMKLPSSS 

TWGQQSNTTACQSQATLSLAEIQKLEEERERQLR 

EEQRRQQRELMKALQQQQQQQQQKLSGWGNV 

SKPSGTTKSLLEIQQEEARQMQKQQQQQQQHQQ 

PNRAIWNTHSNLHTSIGNSVWGSINTGPPNQWA 
SDLVSSIWSNADTKNSNMGFWDDAVKEVGPRN 
STNKNKNNASLSKSVGVSNRQNKKVEEEEKLLK 
LFOGVNKAODGFTOWrFOMT HA? NTANNI DVP 

TFVSFLKEVESPYEVHDYIRAYLGDTSEAKEFAK 

QFLERRAKQKANQQRQQQQLPQQQQQPPQQPP 

QQPQQQDSVWGMNHSTLHSVFQTNQSNNQQSN 

FEAVQSGKKKKKQKMVRADPSLLGFSVNASSER 

LNMGEIETLDDY 

3086 

A 

675 

1334 

LHPAATSTAWLHVPPGLSMALSWVLTVLSLLPL 
1 FAOTPI PAlSTT VPVPTTNATT OR ITGK WFYF A9AF 

RNEEYNKSVQEIQATFFYFTPNKTEDTIFLREYQT 

RQDQCIYNTTYLNVQRENGTISRYVGGQEHFAH 

LLILRDTKTYMLAFDVNDEKNWGLSVYADKPET 

TKEQLGEFYEALDCLRJPKSDVVYTDWKKDKCE 

PLEKQHEKERKQEEGES 

3087 

A 

1 

1575 

CTPVARSMATTATCTRFTDDYQLFEELGKGAFS 

WRRCVKKTSTQEYAAKIINTKKLSARDHQKLE 

REARICRLLKHPNIVRLHDSISEEGFHYLVFDLVT 

GGELFEDIVAREYYSEADASHCIHQILESVNHIHQ 

HDIVHRDLKPENLLLASKCKGAAVKLADFGLAIE 

VQGEQQAWFGFAGTPGYLSPEVLRKDPYGKPVD 

IWACGVILYILLVGYPPFWDEDQHKLYQQIKAG 

AYDFPSPEWDTVTPEAKNLINQMLTINPAKRITA 

DQALKHPWVCQRSTVASMMHRQETVECLRKPN 

ARRKLKGAILTTMLVSRNFSAAKSLLNKKSDGG 

Vl^PO^TsTNU^M^T V^PAOPPAPT OTA MPPOTTA/VT-T 

NATDGIKGSTESCNTTTEDEDLKVRKQEIIKITEQ 

LIEAINNGDFEAYTKICDPGLTSFEPEALGNLVEG 

MDFHKFYFENLLSKNSKPIHTTILNPHVHVIGED 

AACIAYIRLTQYIDGQGRPRTSQSEETRVWHRRD 

GKWLNVHYHCSGAPAAPLQ 

3088 

A 

12 

1039 

SSVAEFPERVQLSQPQNWNFSGAGGAWSLDFAE 

QLKWSAELARLGESIMDGKQGGMDGSKPAGPR 

DFPGIRLLSNPLMGDAVSDWSPMHEAAIHGHQL 

SLRNLISQGWAVNnTADHVSPLHEACLGGHLSC 

VKILLKHGAQVNGVTADWHTPLFNACVSGSWD 

CVNT T T.OHGASVOPF^ni A^PTHFA ARRGHVFP 

VNSLIAYGGNIDHKISHLGTPLYLACENQQRACV 

KKLLESGADVNQGKGQDSPLHAVARTASEELAC 

LLMDFGADTQAKNAEGKRPVELVPPESPLAQLF 

LEREGPPSLMOLCRLRIRKCFGIOOHHKITKLVLP 

EDLKQFLLHL 

3089 

A 

73

432 

DMAGLMTIVTSLLFLGVCAHHIIPTGSVVLPSPCC 
MFFVSKllIPENRVVSYQLSSRSTCLKAGVIFri^ 
GQQFCGDPKQEWVQRYMKNLDAKQKKASPRA 
RAVAVKGPVQRYPGNQTTC 

3090 

A 

4627 

611 

LMEAGGGGGALPAGVETMVLTLGESWPVLVGR 

RFLSLSAADGSDGSHDSWDVERVAEWPWLSGTI 

l^VSHTDVTKXDLKVCVEFDGESWRKRRWTEV 

YSLLRRAFLVEHNLVLAERKSPEISERIVQWPAIT 

YKPLLDKAGLGSITSVRFLGDQQRVFLSKDLLKP 

IQDVNSLRLSLTDNQIVSKEFQALIVKHLDESHLL 

KGDKmVGSEVKIYSLDPSTQWFSATVVNGNPA 

SKTLQVNCEEIPALKIVDPSLIHVEVVHDNLVTC 

GNSARIGAVKRKSSENNGTLVSKQAKSCSEASPS 

MCPVQSVPTTVFKEILLGCTAATPPSKDPRQQST 

PQAANSPPNLGAKIPQGCHKQSLPEEISSCLNTKS 

EALRTKPDVCKAGLLSKSSQIGTGDLKILTEPKGS 

CTQPKTNTDQENRLESVPQALTGLPKECLPTKAS 

SKAELEIANPPELQKHLEHAPSPSDVSNAPEVKA 

GVNSDSPNNCSGKKVEPSALACRSQNLKESSVK 

VDNESCCSRSNNKIQNAPSRKSVLTDPAKLKKLQ 

QSGEAFVQDDSCVNIVAQLPKCRECRLDSLRKD 

KEQQKDSPVFCRFFHFRRLQFNKHGVLRVEGFLT 

PNKYDNEAIGLWLPLTKNVVGIDLDTAKYILANI 

GDHFCQMVISEKEAMSTIEPHRQVAWKRAVKG 

VREMCDVCDTTIFNLHWVCPRCGFGVCVDCYR 

MKRKNCQQGAAYKTFSWLKCVKSQIHEPENLM 

PTQIIPGKALYDVGDIVHSVRAKWGIKANCPCSN 

RQFKLFSKPASKEDLKQTSLAGEKPTLGAVLQQ 

NPSVLEPAAVGGEAASKPAGSMKPACPASTSPLN 

WLADLTSGNVNKENKEKQPTMPILKNEIKCLPPL 

PPLSKSSTVLHTFNSTILTPVSNNNSGFLRNLLNSS 

TGKTENGLKNTPK1LDDIFASLVQNKTTSDLSKR 

PQGLTIKPSILGFDTPHYWLCDNRLLCLQDPNNK 

SNWNVFRECWKQGQPVMVSGVHHKLNSELWK 

PESFRKEFGEQEVDLVNCRTNEIITGATVGDFWD 

GFEDVPNRLKNEKEPMVLKLKDWPPGEDFRDM 

MPSRFDDLMANEPLPEYTRRDGKLNLASRLPNYF 

VRPDLGPKMYNAYGLITPEDRKYGTTNLHLDVS 

LJA A.1N ViVl V I V OIrtVOl^OJDv^J&E.il V L>]V 1 IV^JL/VJUoJL/J& 

LTIKRFIEGKEKPGALWHIYAAKDTEKIREFLKK 
VSEEQGQENPADHDPIHDQSWYLDRSLRKRLHQ 
EYGVQGWAIVQFLGDVVFIPAGAPHQVHNLYSC 
IKVAEDFVSPEHVKHCFWLTQEFRYLSQTHTNHE 
DKLQVKNVIYHAVKDAVAMLKASESSFGKP

3091 

A 

97 

1838 

KRGARRGGWKRKMPSTDLLMLKAFEPYLEILEV 

YSTKAKNYVNGHCTKYEPWQLIAWSVVWTLLI 

VWGYEFVFQPESLWSRFKKKCFKLTRKMPIIGRK 

IQDKLNKTKDDISKNMSFLKVDKEYVKALPSQG 

LSSSAVLEKLKEYSSMDAFWQEGRASGTVYSGE 

EKLTELLVKAYGDFAWSNPLHPDIFPGLRKIEAEI 

VR1ACSLFNGGPDSCGCVTSGGTESILMACKAYR 

DLAFEKGIKTPE1VAPQSAHAAFNKAASYFGMKI 

VRVPLTKMMEVDVRAMRRAISRNTAMLVCSTP

QFPHGVIDPVPEVAKLAVKYKIPLHVDACLGGFL 

IVFMEKAGYPLEHPFDFRVKGVTSISADTHKYGY 

APKGSSLVLYSDKKYRNYQFFVDTDWQGGIYAS 

PTT AH^RPnnf^A ArWA AT MWFfiFlsJCiYVFATKOT 
X 1 l/\.vJOIVr UVJlO/iAU W /\/\JL»lVJ_rLT VJIMNvJ I V Etfx 1 Ivv^l 

IKTARFLKSELENIKGIFVFGNPQLSVIALGSRDFD 

IYRLSM.MTAKGWNLNQLQFPPSIHFCITLLHAR 

KRVAIQFLKDIRESVTQIMKNPKAKTTGMGAIYG 

MAQTTVDRNMGAELSSVFLDSLYSTDTVTQGSQ 

MNGSPKPH 

3092 

A 

79 

2652 

LCSQNSPEDWVNFSSEKQKRYPWYWTGRKLRSE 

RAMKIQKKLTGCSRLMLLCLSLELLLEAGAGNIH 

YSVPEETDKGSFVGNIAKDLGLQPQELADGGVR1 

VSRGRMPLFALNPRSGSLITARR1DREELCAQSM 

PCLVSFNILVEDKMKLFPVEVEIIDINDNTPQFQL 

EELEFKMNEITTPGTRVSLPFGQDLDVGMNSLQS 

YQLSSNPHFSLDVQQGADGPQHPEMVLQSPLDR 

EEEAVHHLILTASDGGEPVRSGTLRIYIQVVDAN 

DNPPAFTQAQYHINVPENVPLGTQLLMVNATDP 

DEGANGEVTYSFHNVDHRVAQIFRLDSYTGEISN 

KEPLDFEEYKMYSMEVQAQDGAGLMAKVKVLI 

KVLDVNDNAPEVT[TSVTTAVPENFPPGTIIALISV 

HDQDSGDNGYTTCFIPGNLPFKLEKLVDNYYRL 

VTERTLDRELISGYNITITAIDQGTPALSTETHISL 

LVTDINDNSPVFHQDSYSAYIPENNPRGASIFSVR 

AHDLDSNENAQITYSLIEDTIQGAPLSAYLSINSD 

TGVLYALRSFDYEQFRDMQLKVMARDSGDPPLS

SNVSLSLFLLDQNDNAPEILYPALPTDGSTGVEL 

APRSAEPGYLVTKVVAVDRDSGQNAWLSYRLL 

KASEPGLFSVGLHTGEVRTARALLDRDALKQSL 

VVAVODHGOPPT SATVTT TV A VADRfPDIT ADI H 

SLEPSAKPNDSDLTLYLVVAEAAVSCVFLAFVIV 
LLAHRLRRWHKSRLLQASGGGLASTPGSHFVGV 
DGVRAFLQTYSHEVSLTADSRKSHL1FPQPNYAD 
TLISQESCEKKGFLSAPQSLLEDKKEPFSQVNFCD 
ECISYLEKNNS 

3093 

A 

1 

3868 

PPDNQKLGLLEALLKIGDWQHAQNIMDQMPPYY 

AASHKLIALAICKLIHITIEPLYRSVTSWAVDHAG 

FLESDPCDSTVGHLLSRVGVPKGAKGSPVNALQ 

NKRAPKQAESFEDLRRDVFNMFCYLGPHLSHDPI 

LFAKVVRIGKSFMKEFQSDGSKQEDKEKTEVILS 

CLLSITDQVLLPSLSLMDCNACMSEELWGMFKT 

FPYQHRYRLYGQWKNETYNSHPLLVKVKAQTID 

RAKYIMKRLTKENVKPSGRQIGKLSHSNPTELFD 

YVCFEILSQIQKYDNLITPVVDSLKYLTSLNYDVL 

ACILSNCirEALANPEKERMKHDDTTISSWLQSLA 

SFCGAVFRKYPIDLAGLLQYVANQLKAGKSFDL 

LILKEVVQKMAGIEITEEMTMEQLEAMTGGEQL 

KAEGGYFGQIR>ITKKSSQRLKDALLDHDLALPL 

CLLMAQQRNGVIFQEGGEKHLKLVGKLYDQCH 

DTLVQFGGFLASNLSTEDYIKRVPSIDVLCNEFHT 

PHDAAFFLSRPMYAHHISSKYDELKKSEKGSKQ 

QHKVHKYITSCEN1VMAPVHEAVVSLHVSKVWD 

DISPQFYATFWSLTMYDLAVPHTSYEREVNKLK 

VQMKAIDDNQEMPPNKKKKEKERCTALQDKLL 

EEEKKQMEHVQRVLQRLKLEKDNWLLAKSTKN 

ETITKFLQLCIFPRCIFSAIDAVYCARFVELVHQQ 

KTPNFSTLLCYDRVFSDIIYTVASCTENEASRYGR 

FLCCMLETVTRWHSDRATYEKECGNYPGFLTIL 

RATGFDGGNKADQLDYENFRHWHKWHYKLT 

KASVHCLETGEYTHIRNILIVLTKILPWYPKVLNL 

GQALERRVHKICQEEKEKRPDLYALAMGYSGQL 

KSRKSYMIPENEFHHKDPPPRNAVASVQNGPGG 

GPSSSSIGSASKSDESSTEETDKSRERSQCGVKAV 

NKASSTTPKGNSSNGNSGSNSNKAVKENDKEKG 

KEKEKEKKEKTPATTPEARVLGKDGKEKPKEER 

PNKDEKARETKERTPKSDKEKEKFKKEEKAKDE 

KFKTTVPNAESKSTQEREREKEPSRERDIAKEMK 

SKENVKGGEKTPVSGSLKSPVPRSDIPEPEREQKR 

RKIDTHPSPSHSSTVKDSLIELKESSAKLYINHTPP 

PLSKSKEREMDKKDLDKSRERSREREKKDEKDR 

KERKRDHSNNDREVPPDLTKRRKEENGTMGVSK 

HKSESPCESPYPNEKDKEKNKSKSSGKEKGSDSF 

KSEKMDKISSGGKKESRHDKEKIEKKEKRDSSGG 

KEEKKHHKSSDKHR 

3094 

A 

2 

891 

AMLGTREPSRRGAGAVQAEVSERLAMAGPQQQ 
PPYLHLAELTASQFLEIWKHFDADGNGYIEGKEL 
ENFFQELEKARKGSGMMSKSDNFGEKMKEFMQ 

vvPivivTcr^riVTPN/r apt aott ptppwfi t PPRowvri 
iv i l^lviNoJL^OJS.llllviAiiLAv^lJUr i &E.rir L>L>\^r fVv^ri v vj 

SSAEFMEAWRKYDTDRSGYIEANELKGFLSDLL 
KKANRPYDEPKLQEYTQTILRMFDLNGDGKLGL 
SEMSRLLPVQENFLLKFQGMKLTSEEFNAIFTFY
DKDRSGYIDEHELDALLKDLYEKNKJCEMNIQQL 
TNYRKSVMSLAEAGKLYRKDLEIVLCSEPPM 

3095 

A 

1685 

700 

RRPTGRPGALGAPAAGRVGMPLHVfCWPFPAVPP 

LTWTLASSVVMGLVGTYSCFWTKYMNHLTVHN 

REVLYELIEKRGPATPLITVSNHQSCMDDPHLWG 

ILKLRHIWNLKLMRWTPAAADICFTKELHSHFFS 

IajKC V r V v^KU Afcr r AfcN JbuivU V JLU 1 OKxlJVlrO 

AGKRREKGDGVYQKGMDFILEKLNHGDWVHIF 

PEGKVNMSSEFLRFKWGIGRLIAECHLNPHLPLW 

HVGMNDVLPNSPPYFPRFGQKITVLIGKPFSALP

VLERLRAENKSAVEMRKALTDFIQEEFQHLKTQ 

AEQLHNHLQAWEIGLACCLLDSWPAQSWG 

3096 

A 

6642 

4022 

FVPGLREPQWEPAQPSATMSAPSEEEEYARLVM 

EAQPEWLRAEVKRLSHELAETTREKIQAAEYGL 

AVLEEKHQLKLQFEELEVDYEAIRSEMEQLKEAF 

GQAHTNHKKVAADGESREESLIQESASKEQYYV 

RKVLELQTELKQLRNVLTNTQSENERLASVAQE 

LKEINQNVEIQRGRLRDDIKEYKFREARLLQDYS 

ELEEENISLQKQVSVLRQNQVEFEGLKHEIKRLE 

EETEYLNSQLEDAIRLKEISERQLEEALETLKTER 

EQKNSLRJKELSHYMSINDSFYTSHLHVSLDGLKF 

SDDAAEPNNDAEALVNGFEHGGLAKLPLDNKTS 

TPKKEGLAPPSPSLVSDLLSELNISEIQKLKQQLM

QMEREKAGLLATLQDTQKQLEHTRGSLSEQQEK 

VTRLTENLSALRRLQASKERQTALDNEKDRDSH 

EDGDYYEVDINGPEILACKYHVAVAEAGELREQ

LKALRSTHEAREAQHAEEKGRYEAEGQALTEKV 

SLLEKASRQDRELLARLEKELKKVSDVAGETQG 

SLSVAQDELVTFSEELANLYHHVCMCNNETPNR 

VMLDYYREGQGGAGRTSPGGRTSPEARGRRSPI 

LLPKGLLAPEAGRADGGTGDSSPSPGSSLPSPLSD 

PRREPN4NIYNLIAIIRDQIKHLQAAVDRTTELSRQ 

RIASQELGPAVDKDKEALMEEELKLKSLLSTKRE 

1^11 1JLK1 V JLIS-AlNlvv^ 1 At V ALrAINL-lVoiS. I JllNJcJVAlVi 

VTETMMKLRNELKALKEDAATFSSLRAMFATRC 

DEYITQLDEMQRQLAAAEDEKKTLNSLLRMAIQ 

QKLALTQRLELLELDHEQTRRGRAKAAPKTKPA 

TPSVSHTCACASDRAFGTGLANOVFCSEKHSIYC 

D 

3097 

A 

1 

879 

MVKWPATRGNLPRSQLTGTHQHCQPREPKITA 
SERLRRRPRATARLRAHAAPPEPPLAVFAPPSDR 
KELLALPVACDPVIASVMSWVQAASLIQGPGDK 
GDVFDEEADESLLAQREWQSNMQRRVKEGYRD
GIDAGKAVTLQQGFNQGYKKGAEVILNYGRLRG 
TLSALLSWCHLHNNNSTLINKINNLLDAVGQCEE 
YVLKHLKSITPPSHVVDLLDSIEDMDLCHVVPAE 
KKIDEAKDERLCENNAEFNKNCSKSHSGIDCSYV 
ECCRTQEHAHSGKPKPHMDFGTDSQF 

3098 

A 

2 

505 

GAATLLRSASSAARKAAEAEQVWLHLHRYLSA 

DRRVLGLREWGRPASERECSLCQRLKRELNMGD 

VEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGL 

FGRKTGQAPGYSYTAANKNKGIIWGEDTLMEYL 

ENPKKYIPGTKMIFVGIKKKEERADLIAYLKKAT 

NE 

3099 

A 

144 

1386 

WAVGQARSFPSHPRMSSWIWSRRWSPSVALRVT 

CTSTSSQRWTVLALSKPGSQQQVSMHTPAPGPPT 

AGHTEPPSEPPRRARVAKYRAKFDPRVTAKYDIK 

ALIGRGSFSRWRVEHRATRQPYAIKMIETKYRE 

GREVCESELRVLRRVRHANIIQLVEVFETQERVY 

MVMELATGGELFDRIIAKGSFTERDATRVLQMV

LDGVRYLHALGITHRDLKPENLLYYHPGTDSKIII 

TDFGLASARXKGDDCLMKTTCGTPEYIAPEVLV

RKPYTNSVDMWALGVIAYILLSGTMPFEDDNRT

RLYRQILRGKYSYSGEPWPSVSNLAKDFIDRLLT 

VDPGARMTALQALRHPWVVSMAASSSMKNLHR 

SISQNLLKRASSRCQSTKSAQSTRSSRSTRSNKSR 

RVRERELREL 

3100 

A 

3 

1500 

ARWNGRWVQVPAWPGPGCGTNASGERQRQLPR 

AWRPVGRTLGSEPIALAWSPPLYLFPIPLPSWAVS 

QPTPTLGTMFADLDYDIEEDKLGIPTVPGKVTLQ 

KDAQNLIGISIGGGAQYCPCLYIVQVFDNTPAAL 

DGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEV 

KGEVTIHYNKLQADPKQGMSLDIVLKKVKHRLV

ENMSSGTADALGLSRAILCNDGLVKRLEELERTA 

ELYKGMTEHTKNLLRAFYELSQTHRGNGIPQSC 

AFGDVFSVIGVREPQPAASEAFVKFADAHRSIEK 

FGIRLLKTIKPMLTDLNTYLNKAIPDTRLTIKKYL 

DVKFEYLSYCLKVKEMDDEEYSCIALGEPLYRV 

STGNYEYRLILRCRQEARARFSQMRKDVLEKME 

LLDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLR 

DADVFPIEVDLAHTTLAYGLNQEEFTDGEEEEEE 

EDTAAGEPSRDTRGAAGPLDKGGSWCDS 

3101 

A 

1173 

197 

QGMDSKQQCVKLNDGHFMPVLGFGTYAPPEVP 

RSKALEVTKLAIEAGFRHIDSAHLYNNEEQVGLA

IRSKIADGSVKREDIFYTSKLWSTFHRPELVRPAL 

ENSLKKAQLDYVDLYLIHSPMSLKPGEELSPTDE 

NGKVIFDIVDLCTTWEAMEKCKDAGLAKSIGVS 

NFNRRQLEMILNKPGLKYKPVCNQVECHPYFNR 

SKLLDFCKSKDIVLVAYSALGSQRDKRWVDPNS 

PVLLEDPVLCALAKKHKRTPALIALRYQLQRGV 

WLAKSYNEQRIRQNVQVFEFQLTAEDMKAIDG 

LDRNLHYFNSDSFASHPNYPYSDEY 

3102 

A 

144 

1098 

EQPRPPPCGRRPLPLGSAPCRVRLGRAPRQAPAM 

SMLPSFGFTQEQVACVCEVLQQGGNLERLGRFL 

WSLPACDHLHKNESVLKAKAVVAFHRGNFREL 

YKILESHQFSPHNHPKLQQLWLKAHYVEAEKLR 

GRPLGAVGKYRVRQKFPLPRTIWDGEETSYCFK 

EKSRGVLREWYAHNPYPSPREKRELAEATGLTT
TQVSNWFKNRRQRDRAAEAKERENTENNNSSSN 
KQNQLSPLEGGKPLMSSSEEEFSPPQSPDQNSVLL 
LQGNMGHARSSNYSLPGLTASQPSHGLQTHQHQ 
LQDSLLGPLTSSLVDLGS 

3103 

A 

111 

1582 

LVYSWGCHIMADNDTDRNQTEKLLKRVRELEQ 

EVQRLKKEQAKNKEDSNIRENSSGAGKTKRAFD 

FSAHGRRHVALRJAYMGWGYQGFASQENTNNTI 

EEKLFEALTKTRLVESRQTSNYHRCGRTDKGVS 

AFGQVISLDLRSQFPRGRDSEDFNVKEEANAAAE 

EIRYTHILNRVLPPDIRILAWAPVEPSFSARFSCLE 

RTYRYFFPRADLDIVTMDYAAQKYVGTHDFRNL 

CKMDVANGVINFQRTILSAQVQLVGQSPGEGRW 

QEPFQLCQFEVTGQAFLYHQVRCMMAILFLIGQ 

GMEKPFHDFLLNIEKNPOKPOYSMAVFFPI VI Y 

DCKFENVKWIYDQEAQEFNITHLQQLWANHAV 

KTHMLYSMLQGLDTVPVPCGIGPKMDGMTEWG 

NVKPSVIKQTSAFVEGVKMRTYKPLIVxDRPKCQG 

LESRIQHFVRRGRIEHPHLFHEEETKAKRDCNDT 

LEEDNTNLETPTKRVCVDTEIKSII 

3104 

A 

227 

1519 

VTLIBCMNAMLETPELPAVFDGVKLAAVAAVLYV 

IVRCLNLKSPTAPPDLYFQDSGLSRFLLKSCPLLT 

KEYIPPLIWGKSGHIQTALYGKMGRVRSPHPYGH 

RKFITMSDGATSTFDLFEPLAEHCVGDDITMVICP 

GIANHSEKQYIRTFVDYAQKNGYRCAVLNHLGA

LPNIELTSPRMFTYGCTWEFGAMVNYIKKTYPLT 

QLVVVGFSLGGNIVCKYLGETQANQEKVLCCVS 

V V^V^/U I 0/A.ljXVr\v^X-» XJTxvxV^ W X^V^^lVivr I INT L.lV.LrtL^iN 

MKKHLSHRQALFGDHVKKPQSLEDTDLSRLYTA 

TSLMQIDDNVMRKFHGYNSLKEYYEEESCMRYL 

HRIYWLMLVNAADDPLVHESLLTIPKSLSEKRE 

NVMFVLPLHGGHLGFFEGSVLFPEPLTWMDKLV 

VEYANAICQWERNKLQCSDTEQVEADLE 

3105 

A 

1 

1251 

MGLLLMILASAVLGSFLTLLAQFFLLYRRQPEPP 

ADEAARAGEGFRYIKPVPGLLLREYLYGGGRDE 

EPSGAAPEGGATPTAAPETPAPPTRETCYFLNATI 

LFLFl^LRDTALTRRWVTKKIKVEFEELLQTKTA 

GRLLEGLSLRDVFLGETVPFIKTIRLVRPVVPSAT 

GEPDGPEGEALPAACPEELAFEAEVEYNGGFHLA

IDVDLVFGKSAYLFVKLSRVVGRLRLVFTRVPFT 

HWFFSFVEDPLIDFEVRSOFEGRPMPOLTSIIVNO 

LKXIIKEXHTLP>rYKl^l^FFPYQTLQGFEEDEE 

HIHIQQWALTEGRLKVTLLECSRLLIFGSYDREA

NVHCTLELSSSVWEEKQRSSIKTGTISLTAVFMG 

WHRVSEAFPGLWYKLLVDLPFWGLEDGGPLLT 

VPLRQCPG 

3106 

A 

972 

468 

MAAAGAGRLRRVASALLLRSPRLPARFI SAPAR 

LYHKKVVDHYENPRNVGSLDKTSKNVGTGLVG 

APACGDVIV1XLQIQVDEKGKIVDARFKTFGCGSA 

IASSSLATEWVKGKTVEEALTIKNTDIAKELCLPP

VKLHCSMLAEDAIKAALADYKLKQEPKKGEAE 

KK 

3107 

A 

106 

1221 

TCQDVRSVFSLVRANIFGEESTAGAGWHREEDM 
RKELQLSLSVTLLLVCGFLYQFTLKSSCLFCLPSF 
KSHQGLEALLSHRRGrVFLETSERMEPPHLVSCS 
VESAAKJYPEWPVVFFMKGLTDSTPMPSNSTYPA
FSFLSAIDNVFLFPLDMKRLLEDTPLFSWYNQINA 

SAERNWLHISSDASRLAIIWKYGGIYMDTDVISIR

PIPEENFLAAQASRYSSNGIFGFLPHHPFLWECME 

NFVEHYNSAIWGNQGPELMTRMLRVWCKLEDF 

QEVSDLRCLNISFLHPQRFYPISYREWRRYYEVW 

DTEPSrTWSYALHLWNHMNQEGRAVIRGSNTLV 

ENLYRKHCPRTYRDLIKGPEGSVTGELGPGNK 

3108 

A 

1612 

839 

EVALFCFEMAAGMYLEHYLDSIENLPFELQRNFQ 

LMRDLDQRTEDLKAEIDKLATEYMSSARSLSSEE

KLALLKQIQEAYGKCKEFGDDKVQLAMQTYEM 

VDKHIRRLDTDLARFEADLKEKQIESSDYDSSSS 

KGKKKGRTQKEKKAARARSKGKNSDEEAPKTA 

QKKLKLVRTSPEYGMPSVTFGSVHPSDVLDMPV 

DPNEPTYCLCHQVSYGEMIGCDNPDCSIEWFHFA 

CVGLTTKPRGKWFCPRCSQERKKK 

3109 

A 

1 

2613 

MVAVRAAGPREGASQDEAGTVWAPMTGCPCQC 

RPGPSWLLVDTLEPETAYPVQRPGPEQAGNQRL

QMKRAQFGPHDWLSLPVPPGPSWLLVDTLEPET 

AYQFSVLAQNKLGTSAFSEVVTVNTLAFPITTPEP 

LVLVTPPRCLIANRTQQGVLLSWLPPANHSFPIDR 

YIMEFRVAERWELLDDGIPGTEGEFFAKDLSQDT 

WYEFRVLAVMQDLISEPSNIAGVSSTDIFPQPDLT 

EDGLARPVLAGIVATICFLAAAILFSTLAACFVNK 

QRKRKLKRKKDPPLSITHCRKSLESPLSSGKVSPE 

SIRTLRAPSESSDDQGQPAAKKMLSPTREKELSL 

YKKTKRAISSKKYSVAKAEAEAEATTPIELISRGP

DGRFVMDPAEMEPSLKSRRIEGFPFAEETDMYPE 

FRQSDEENEDPLVPTSVAALKSQLTPLSSSQESYL 

PPPAYSPRFQPRGLEGPGGLEGRLQATGQARPPA 

PRPFHHGQYYGYLSSSSPGEVEPPPFYVPEVGSPL 

SSVMSSPPLPTEGPFGHPTIPEENGENASNSTLPLT 

QTPTGGRSPEPWGRPEFPFGGLETPAMMFPHQLP 

PCDVPESLQPKAGLPRGLPPTSLQVPAAYPGILSL 

EAPKGWAGKSPGRGPVPAPPAAKWQDRPMQPL 

VSQGQLRHTSQGMGIPVLPYPEPAEPGAHGGPST 

FGLDTRWYEPQPRPRPSPRQARRAEPSLHQVVLQ 

PSRLSPLTQSPLSSRTGSPELAARARPRPGLLQQA 

EMSEITLQPPAAVSFSRKSTPSTGSPSQSSRSGSPS 

YRPAMGFTTLATGYPSPPPGPAPAGPGDSLDVFG 

QTPSPRRTGEELLRPETPPPTLPTLGKLRRDRPAP 

ATSPPERALSKL 

3110 

A 

88 

924 

ILGSRTMSLTNTKTGFSVKDILDLPDTNDEEGSV 

AEGPEEENEGPEPAKRAGPLGQGALDAVQSLPL 

KNPFYDSSDNPYTRWLASTEGLQYSLHGLAAGA 

PPQDSSSKSPEPSADESPDNDKETPGGGGDAGKK 

RKRRVLFSKAQTYELERRFRQQRYLSAPEREHLA 

SLIRLTPTQVKIWFQOTRYKMKJRARAEKGMEVT 

PLPSPRRVAVPVLVRDGKPCHALKAQDLAAATF 

QAGIPFSAYSAQSLQHMQYNAQYSSASTPQYPT 

AHPLVQAQQWTW 

3111 

A 

595 

291 

PSVASLARRFSGRALWPPSHSVPGNRALCPRLLH 
GTTLPGGNQRELARQKNMKKQSDSVKGKRRDD 
GLSAAARKQRDSTPRDSEIMQQKQKKANEKKEE 
PK 

3112 

A 

3641 

1555 

APMLQIHHFSFKLIFQNIHKSKFISQRLSQNADST

RHTNLSNTHYSDLIVWNCCLFFRNWCNEFFLKS 

CHFAQEREGSGDLCNSRAEKTKSAACVIFRRFPV 

APLIPYPLITKEDINAIEMEEDKRDLISREISKFRDT

HKKLEEEKGKKEKERQEIEKERRERERERERERE

RREREREREREREREKEKERERERERDRDRDRTK 

ERDRDRDRERDRDRDRERSSDRNKDRSRSREKS 

RDRERERERERERERERERERERERERERERERE 

REREKDKKRDREEDEEDAYERRKLERKLREKEA 

AYQERLKNWEIRERKKTREYEKEAEREEERRRE 

MAKEAKRLKEFLEDYDDDRDDPKYYRGSALQK 

RLRDREKEMEADERDRKREKEELEEIRQRLLAE 

GHPDPDAELQRMEQEAERRRQPQIKQEPESEEEE 

EEKQEKEEKREEPMEEEEEPEQKPCLKPTLRPISS 

APSVSSASGNATPNTPGDESPCGIIIPHENSPDQQ

OPFFHRPKTGT ST KT GASNSPfiOPNWTCRKlCT PV 

DSVFNKFEDEDSDDVPRKRKLVPLDYGEDDKNA 

TKGTVNTEEKRKHIKSLIEKIPTAKPELFAYPLDW

SIVDSILMERJURPWINKKIIEYIGEEEATLVDLVC 

SKVMAHSPPQSILDDVAMVLDEEAEVFIVKMWR 

LLIYETEAKKIGLVK 

3113 

A 

1 

669 

VCAGIRDPCSTPLAKPAAGGAENLSFGKQPGLET

MTT K'MTTPNI^TPPnAnPK'OI PRTOTVRFlHSnAV 
lNlJL/JVlvi i i r INrv 1 r r\JJ\Ur J\.v</LiC£\. 1 \J 1 V XvC. IUj V 

WSLSSCKPGFGVDQLRDDNLETYWQSDGSQPHL 
VNIQFRRKTTVKTLCIYADYKSDESYTPSKISVRV 
GNNFHNLQEIRQLELVEPSGWIHVPLTDNHKKPT
RTFMIQIAVLANHQNGRDTHMRQIKIYTPVEESSI 
GKFPRCTTIDFMMYRSIR 

3114 

A 

1

1613 

MTSKEESRRQQPTAGPAGQGKLPSPSEPQLPTPP 

TRSLHHFRRPLSPSREAQAHIAPSSELHLPQSQSA 

GPPPLGAGTEVELVVPGRDEGSRGALPGSSGVKF 

VWRKIVRFPVSDQVRTLSISRLMRRLLEMMQTL 

VQFIIGWRSLLGRTLGTIMNTMYVMMAQILRSH 

LIKATVIPNRVKMLPYFGIIRNRMMSTHKSKKKI 

REYYRLLNVEEGCSADEVRESFHKLAKQYHPDS 

GSNTADSATFIRIEKAYRKVLSHVIEQTNASQSK 

GEEEEDVEKFKYKTPQHRHYLSFEGIGFGTPTQR 

EKHYRQFRADRAAEQVMEYQKQKLQSQYFPDS 

VIVKNIROSKOOKITOAIERLVEDLrOF5?MAKGDF 

DNLSGKGKPLKCTSDCSYIDPMTHNLNRILIDNG 

YQPEWILKQKEISDTIEQLREAILVSRKKLGNPMT 

PTEKKQWNHVCEQFQENIRKLNKRINDFNLIVPI 

LTRQKVHFDAQKEIVRAQKIYETLIKTKEVTDRN 

PNNLDQGEGEKTPEIKKGFLNLMDLVEIY 

3115 

A 

1 

2036 

FRHRCGCLSYCRSRRGIRRVEPLRRARARVGPRF 

RPLCRMEIIRSNFKSNLHKVYQAIEEADFFAE)GE 

FSGISDGPSVSALTNGFDTPEERYQKLKKHSMDF 

LLFQFGLCTFKYDYTDSKYITKSFNFYVFPKPFNR 

SSPDVKFVCQSSSIDFLASQGFDFNKGFRKGIPYL 

NQEEERQLREQYDEKRSQANGAGALSYVSPNTS 

KCPVTIPEDQKKFIDQVVEKJEDLLQSEENICNLDL 

EPCTGFQRKLIYQTLSWKYPKGIHVETLETEKKE 

RYIVISKVDEEERKRREQQKHAKEQEELNDAVG 

FSRVIHAIA^SGKLVlGHNMLLDVlVffl 

PLPADLSEFKEMTTCVFPRLLDTKLMASTQPFKD 

IINNTSLAELEKRLKETPFNPPKVESAEGFPSYDT

ASEQLHEAGYDAYITGLCFISMANYLGSFLSPPKI 

HVSARSICLIEPFFNKLFLMRVMDIPYLNLEGPDL 

QPKRDHVLHVTFPKEWKTSDLYQLFSAFGNIQIS

WIDDTSAFVSLSQPEQVKIAVNTSKYAESYRIQT 

YAEYMGRKQEEKQIKRKWTEDSWKEADSKRLN 

PQCIPYTLQNHYYRNNSFTAPSTVGKRNLSPSQE 

EAGLEDGVSGEISDTELEQTDSCAEPLSEGRKKA 

KKLKRMKKELSPAGSISKNSPATLFEVPDTW 

3116 

A 

3 

1443 

TREAPMALAVAPWGRQWEEARALGRAVRMLQ 

RLEEQCVDPRLSVSPPSLRDLLPRTAQLLREVAH 

SRRAAGGGGPGGPGGSGDFLLIYLANLEAKSRQ 

VAALLPPRGRRSANDELFRAGSRLRRQLAKLAII 

FSHMHAELHALFPGGKYCGHMYQLTKAPAHTF 

WRESCGARCVLPWAEFESLLGTCHPVEPGCTAL

ALRTTIDLTCSGHVSIFEFDVFTRLFQPWPTLLKN 

WQLLAVNHPGYMAFLTYDEVQERLQACRDKPG 

SYIFRPSCTRLGQWAIGYVSSDGSILQTIPANKPLS 

QVLLEGQKDGFYLYPDGKTHNPDLTELGQAEPQ 

QRIHVSEEQLQLYWAMDSTFELCKICAESNKDV 

KIEPCGHLLCSCCLAAWQHSDSQTCPFCRCEIKG 

WEAVSIYQFHGQATAEDSGNSSDQEGRELELGQ 

VPLSAPPLPPRPDLPPRKPRNAQPKVRLLKGNSPP 

AALGPQDPAPA 

3117 

A 

296 

3547 

ERHSSPLLQHILTHALMRNKKHSNNWLAQHWF 

QSSIILCFSPVGRTLRVRARKFPAIVNCTAIDWFH

AWPQEALVSVSRRFIEETKGIEPVHKDSISLFMAH 

VHTTVNEMSTRYYQNERRHNYTTPKSFLEQISLF 

KNLLKKKQNEVSEKKERLVNGIQKLKTTASQVG 

DLKARLASQEAELQLRNHDAEALITKIGLQTEKV 

SREKTIADAEERKVTAIQTEVFQKQRECEADLLK 

AEPALVAATAALNTLNRVNLSELKAFPNPPIAVT

NVTAAVMVLLAPRGRVPKDRSWKAAKVFMGK

VDDFLQALINYDKEHIPENCLKWNEHYLKDPEF 

NPNLIRTKSFAAAGLCAWVINIIKFYEVYCDVEP 

KRQALAQANLELAAATEKLEAIRKKLVVSANYD 

IEKSEKIRWGQSIKSFEAQEKTLCGDVLLTAAFVS 

YVGPFTRQYRQELVHCKWVPFLQQKVSIPLTEG 

LDLISMLTDDATIAAAVNNEGLPSDRMSTENAAIL 

THCERWPLVIDPQQQGIKWIKNKYGMDLKVTHL 

GQKGFLNAIETALAFGDVILIENLEETIDPVLDPL 

LGRNTIKKGKYimGDKECEFNKNFRLILHTKLAN 

PHYKPELQAQTTLLNFTVTEDGLEAQLLAEVVSI 

ERPDLEKLKLVLTKHQNDFKIELKYLEDDLLLRL

SAAEGSFLDDTKLVERLEATKTTVAEIEHKVIEA 

KENERKINEARECYRPVAARASLLYFVINDLQKI 

NPLYQFSLKAFNVLFHRAIEQADKVEDMQGRJSI 

LMESITFIAVFLYTSQALFEKDKLTFLSQMAFQIL 

LRKKEIDPLELDFLLRFTVEHTHLSPVDFLTSQSW 

SAIKAIAVMEEFRGIDRDVEGSAKQWRKWVESE 

CPEKEKLPQEWKKKSLIQKLILLRAMRPDRMTY 

ALRNFVEEKLGAKYVERTRLDLVKAFEESSPATP 

IFFILSPGVDALKDLEILGKRLGFTIDSGKFHNVSL 

GQGQETVAEVALEKASKGGHWVILQNVHLVAK 

WLGTLEKLLERFSQGSHRBYRVFMSAESAPTPD 

EHIIPQGLLENSIFaTNEPPTGMLANLHAALYNFD

o 

3118 

A 

1 

226 

PYSLSTSCLGSPTSPRLEMDPNCSCATGGSCTCTG 
SCKCKECKCNSCKKSECGAISRNLGLSQVRGRKP 
ELGMEE 

3119 

A 

1254 

4133

PLATLTMEEQGHSEMEIIPSESHPHIQLLKSNREL 

LVTHIRNTQCLVDNLLKNDYFSAEDAEIVCACPT 

QPDKVRKILDLVQSKGEEVSEFFLYLLQQLADAY 

VDLRPWLLEIGFSPSLLTQSKVVVNTDPVSRYTQ 

QLRHHLGRDSKFVLCYAQKEELLLEEIYMDTIME 

LVGFSNESLGSLNSLACLLDHTTGILNEQGETIFIL 

GDAGVGKSMLLQRLQSLWATGRLDAGVKFFFH 

FRCRMFSCFKESDRLCLQDLLFKHYCYPERDPEE 

VFAFLLRFPHVALFTFDGLDELHSDLDLSRVPDS 

SCPWEPAHPLVLLANLLSGKLLKGASKLLTART 

GIEVPRQFLRKKVLLRGFSPSHLRAYARRMFPER 

ALQDRLLSQLEANPNLCSLCSVPLFCWIIFRCFQH 

FRAAFEGSPQLPDCTMTLTDVFLLVTEVHLNRM 

QPSSLVQRNTRSPVETLHAGRDTLCSLGQVAHR 

GMEKSLFVFTQEEVQASGLQERDMQLGFLRALP 

ELGPGGDQQSYEFFHLTLQAFFTAFFLVLDDRVG 

TQELLRFFQEWMPPAGAATTSCYPPFLPFQCLQG 

SGPAREDLFKNKDHFQFTNLFLCGLLSKAKQKLL 

RHLVPAAALRRKRKALWAHLFSSLRGYLNSLPR 

VQVESFNQVQAMPTFIWMLRCIYETQSQKVGQL 

AARGICANYLKLTYCNACSADCSALSFVLHHFP

KRLALDLDNNNLNDYGVRELQPCFSRLTVLRLS 

VNQITDGGVKVLSEELTKYKIVTYLGLYNNQITD

VGAR i V 1 IviLUJbUJSAjL 1 riLKi^LjjviN jsj i or.vjvjJv i 

LALAVKNSKSISEVGMWGNQVGDEGAKAFAEA 

LRNHPSLTTLSLASNGISTEGGKSLARALQQNTSL 

EILWLTQNELNDEVAESLAEMLKVNQTLKHLWL 

IQNQITAKGTAQLADALQSNTGITEICLNGNLIKP 

EEAKVYEDEKRIICF 

3120 

A 

43 

1004 

QLWGFAAGSDSRPAMGCDGGTIPKRHELVKGPK 
KVEKVDKDAELVAQWNYCTLSQEILRRPIVACE 
LGRLYNKDAVIEFLLDKSAEKALGKAASHIKSIK
NVTELKLSDNPAWEGDKGNTKGDKHDDLQRAR 

r lC_,P V V Li LilIVlJN Lrjvilrvr i^ivov^\j^ v r oc>r^-rvj-»ivL.i 

KAEVCHTCGAAFQEDDVIVLNGTKEDVDVLKTR 

MEERRLRAKLEKKTKKPKAAESVSKPDVSEEAP 

GPSKVKTGKPEEASLDSREBCKTNLAPKSTAMNE 

SSSGKAGKPPCGATKRSIADSEESEAYKSLFTTHS 

SAKRSKEESAHWVTHTSYCF 

3121 

A 

3 

1490

HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK
SQGVNDNEEGFFSARGHRPLDKXREEAPSLRPAP 
PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 
HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 
NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 
KDNE1WVNEYSSELEKHQLYIDETVNSNIPTNLR 
VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 
CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP 
YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 
DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 
ISQLTRMGFTELLIEMEDWKGDKVKAHYGGFTV 
QNEANKYQISVNKYRGTAGNALMDGASQLMGE
NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 
EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 
AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP
FFPQQ 

3122 

A 

3 

1490 

HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP

FFPQQ 

3123 

A 

3 

1490 

HASGPTRPVSWSFHKLKTMKHLLLLLLCVFLVK 

SQGVNDNEEGFFSARGHRPLDKKREEAPSLRPAP 

PPISGGGYRARPAKAAATQKKVERKAPDAGGCL 

HADPDLGVLCPTGCQLQEALLQQERPIRNSVDEL 

NNNVEAVSQTSSSSFQYMYLLKDLWQKRQKQV 

KDNENVVNEYSSELEKHQLYIDETVNSNIPTNLR 

VLRSILENLRSKIQKLESDVSAQMEYCRTPCTVS 

CNIPVVSGKECEEIIRKGGETSEMYLIQPDSSVKP

YRVYCDMNTENGGWTVIQNRQDGSVDFGRKW 

DPYKQGFGNVATNTDGKNYCGLPGEYWLGNDK 

ISQLTRMGPTELLIEMEDWKGDKVKAHYGGFTV 

QNEANKYQISVNKYRGTAGNALMDGASQLMGE 

NRTMTIHNGMFFSTYDRDNDGWLTSDPRKQCSK 

EDGGGWWYNRCHAANPNGRYYWGGQYTWDM 

AKHGTDDGVVWMNWKGSWYSMKKMSMKIRP

FFPQQ 

3124 

A

3 

544 

RVDDFVLLRSRLALRWLSHVRRPSRRVPRMPRG 

SRSRTSRMAPPASRAPQMRAAPRPAPVAQPPAA 

APPSAVGSSAAAPRQPGLMAQMATTAAGVAVG 

SAVGHTLGHAITGGFSGGSNAEPARPDITYQEPQ 

GTQPAQQQQPCLYEIKQFLECAQNQGDIKLCEGF 

NEVLKQCRLANGLA 

3125 

A 

3 

571 

GNSYNHRSLAAYPYMSHSQHSPYLQSYHNSSAA 

AQTRGDDTDQQKTTVIENGEIRFNGKGKKIRKPR

TIYSSLQLQALNHRFQQTQYLALPERAELAASLG 

LTQTQVKIWFQNKRSKFKKLLKQGSNPHESDPL

QGSAALSPRSPALPPVWDVSASAKGVSMPPNSY 

MPGYSHWYSSPHQDTMQRPQMM 

3126 

A 

43 

5377 

LSVFFPIPVDGRDRGSNPSLESTSSELSTSTSEGSL 

SAMSGRNELHSRLHPHPQSSLIPMMFSPPESLLAS 

CELRGNFAEAHQVLFTFNLKSSPSSGELMFMERY 

QEVIQELAQVEHKIENQNSDAGSSTIRRTGSGRST 

LQAIGSAAAAGMVFYSISDVTDKLLNTSGDPEPM 

LQEDFWISTALVEPTAPLREVLEDLSPPAMAAFD 

LACSQCQLWKTCKQLLETAERRLNSSLERRGRRI

dhvllnadgirgfpvvlqqiskslnyllmsasqt 

ksesveekgggpprcsitellqmcwpslsedcva 

shttlsqqldqvlqslrealelpeprtpplsslve 

qaaqkapeaeahpvqiqtqllqknlgkqtpsgs 

rqmdylgtffsycstlaavllqslssepdhvevk 

vgnpfvllqqsssqlvshllferqvpperlaall 

aqenlslsvpqvivsccceplalcssrqsqqtssl 

ltrlgtlaqlhashclddlplstpssprttenptl 

erkpyssprdsslpaltssalaflksrskllatva 

clgasprlkvskpslswkelrgrrevplaaeqv 

arecerlleqfplfeafllaaweplrgslqqgqs 

lavnlcgwaslstvllglhspialdvlseafees 

lvardwsralqltevygrdvddlssikdavlsc 

avacdkegwqylfpvkdaslrsrlalqfvdrw 

plescleilaycisdtavqeglkcelqrklaelq 

vyqkilglqsppvwcdwqtlrsccvedpstvmn 

mileaqeyelceewgclypiprehlislhqkhll 

hllerrdhdkalqllrrjpdptmclevteqsldq 

htslatshflanyltthfygqltavrhreiqaly

vgskilltlpeqhrasyshlssnplfmleqllmn 

mkvdwatvavqtlqqllvgqeigftmdevdsl 

lsryaekaldfpypqrekrsdsvihlqeivhqaa 

dpetlprspsaefspaappgissihspslrersfppt 

qpsqefvppatpparhqwvpdetesicmvccreh 

ftmfmrhhcrrcgrlvcsscstkkmvvegcre 

nparvcdqcysycnkdvpeepsekjpealdsskse 

sppysfvvrvpkadevewildlkeeenelvrsef 

yyeqapsaslciailnlhrdsiacghqliehccrl 

skgltnpevdaglltdimkqllfsakmmfvkag

qsqdlalcdsyiskvdvlnilvaaayrhvpsldq 

ilqpaavtrlrnqlleaeyyqlgvevstktgldt 

tgawhawgmaclkagnltaarekfsrclkppf 

dlnqlnhgsrlvqdvveylestvrpfvslqddd 

yfatlreleatlrtqslslavipegkimnntyyq 

eclfylhnystnlaiisfyvrhsclreallhllnk 

esppevftegifqpsyksgklhtlenllesedptles 

wgkyliaacqhlqkknyyhilyelqqfmkdqv 

raamtcirffshkaksytelgeklswllkakdh 

lkiylqetsrssgrkkttffrkkmtaadvsrhm 

ntlqlqmevtrflhrcesagtsqittlplptlfg 

dfqldaamtycraarqlvekekyseiqqllkcv 

sesgmaaksdgdtillncleafkrippqccfcsa 

qelegliqaihnddnkvrayliccklrsayliav 

kqehsratalvqqvqqaakssgdawqdicaq 

wlltshprgahgpgsrk 

3127 

A 

467 

1259 

HLGPPLAWIPAASLTSTKGEFGVEDDRPARGPPP 
PKSEEASWSESGVSSSSGDGPFAGGEVDKRLHQL 

K"TOT ATT T<s^T ATVTOFK' ( \RMF A 9YI Ani^K'K'Mlf 

QDLEDASNKAEEERARLEGELKGLQEQIAETKA 

RLITQQHDRAQEQSDHALMLRELQKLLQEERTQ 

RQDLELRLEETREALAGRAYAAEQMEGFELQTK 

QLTREVEELKSELQAIRDEKNQPDPRLQELQEEA 

ARLKSHFQAQLQQEMRKVIIHISFKHQPLT 

3128 

A 

1854 

798 

ASGSPAPSSSSAMAAACGPGAAGYCLLLGLHLFL

LTAGPALGWNDPDRMLLRDVKALTLHYDRYTT 

SRRLDPIPQLKCVGGTAGCDSYTPKVIQCQNKG 

WDGYDVQWECKTDLDIAYKFGKTVVSCEGYES 

SEDQYVLRGSCGLEYNLDYTELGLQKLKESGKQ 

HGFASFSDYYYKWSSADSCNMSGLITIVVLLGIA

FVVYKLFLSDGQYSPPPYSEYPPFSHRYQRFTNS 

AGPPPPGFKSEFTGPQNTGHGATSGFGSAFTGQQ 

GYENSGPGFWTGLGTGGILGYLFGSNRAATPFSD 

SWYYPSYPPSYPGTWNRAYSPLHGGSGSYSVCS 

NSDTKTRTASGYGGTRRR 

3129 

A 

2340 

1192 

ELARRPKQQSSEKSRNMIRNWLTIFILFPLKLVEK 

CESSVSLTVPPVVKLENGSSTNVSLTLRPPLNATL 

VITFEITFRSKNITJLELPDEVVVPPGVTNSSFQVT

SQNVGQLTVYLHGNHSNQTGPRIRFLVIRSSAISII

NQVIGWIYFVAWSISFYPQVIMNWRRKSVIGLSF 

DFVALNLTGFVAYSVFNIGLLWVPYIKEQFLLKY 

PNGVNPVNSNDVFFSLHAVVLTLIIIVQCCLYERG 

GQRVSWPAIGFLVLAWLFAFVTMIVAAVGVITW

LQFLFCFSYIKLAVTLVKYFPQAYMNFYYKSTEG 

WSIGNVLLDFTGGSFSLLQMFLQSYNNDQWTLIF 

GDPTKFGLGVFSIVFDVVFFIQHFCLYRKRPGYD 

QLN 

3130 

A 

31 

2026 

CWWPPLLPQLEPEPPPLRPRVAASQGGGMLGKG 

VVGGGGGTKAPKPSFVSYVRPEEIHTNEKEVTEK 

EVTLHLLPGEQLLCEASTVLKYVQEDSCQHGVY 

GRLVCTDFKIAFLGDDESALDNDETQFKNKVIGE 

NDITLHCVDQIYGVFDEKKKTLFGQLKKYPEKLII 

HCKDLRVFQFCLRYTKEEEVKRIVSGIIHHTQAP 

KLLKRLFLFSYATAAQNNTVTDPKNHTVMFDTL 

KDWCWELERTKGNMKYKAVSVNEGYKVCERL 

PAYFVVPTPLPEENVQRFQGHGIPIWCWSCHNGS 

ALLKMSALPKEQDDGILQIQKSFLDGIYKTIHRPP 

YEIVKTEDLSSNFLSLQEIQTAYSKFKQLFLIDNST 

EFWDTDIKWFSLLESSSWLDIIRRCLKKAIEITEC

MEAQNMNVLLLEENASDLCCLISSLVQLMMDPH

CRTRIGFQSLIQKEWVMGGHCFLDRCNHLRQND

KEEHQRQLSLPLTQSKSSPKRGFFREETDHLIKNL 

LGKRISKLINSSDELQDNFREFYDSWHSKSTDYH 

GLLLPHIEGPEIKVWAQRYLRWIPEAQILGGGQV 

ATLSKLLEMMEEVQSLQEKIDERHHSQQAPQAE 

APCLLRNSARLSSLFPFALLQRHSSKPVLPTSGW 

KALGDEDDLAKREDEFVDLGDV 

3131 

A 

126 

965 

QSRSRPRREGVGTGSRAVLCILATCGSKMSDIGD 

WFRSIPAITRYWFAATVAVPLVGKLGLISPAYLF 

LWPEAFLYRFQIWRPITATFYFPVGPGTGFLYLV 

NLYFLYQYSTRLETGAFDGRPADYLFMLLFNWI 

CIVITGLAMDMQLLMIPLIMSVLYVWAQLNRDM

IVSFWFGTRFKACYLPWVILGFNYIIGGSVINELIG 

NLVGHLYFFLMFRYPMDLGGRNFLSTPQFLYRW 

LPSRRGGVSGFGVPPASMRRAADQNGGGGRHN 

WGQGFRLGDQ 

3132 

A 

2 

350 

FVAGWRALTAPSTSARLRAFGWQAAARLLVFG 
ARGVGLGSGAPGSLPCYLRMDALALLGGLVNV 
ARLPERWGPGRFDYWGNSHQrMHLLSVGSILQL 
HAGWPDLLWAAHHACPRD

3133 

A 

1 

2921 

MTCFKGQKGEQRSHAFEANKDHKAKVPSPNLYS 

QLNALQFTVDERSILWLNQFLLDLKQSLNQFMA 

VYKLNDNSKSDEHVDVRVDGLMLKFVIPSEVKS 

ECHQDQPRAISIQSSEMIATNTRHCPNCRHSDLEA

LFQDFKDCDFFSKTYTSFPKSCDNFNLLHPIFQRH 

AHEQDTKMHEIYKGNITPQLNKNTLKTSAATDV 

WAVYFSQFWIDYEGMKSGKGRPISFVDSFPLSIW 

ICQPTRYAESQKEPQTCNQVSLNTSQSESSDLAG 

RLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLF 

LHESLILLSENLRKDVEAVTGSPASQTSICIGILLR 

SAELALLLHPVDQANTLKSPVSESVSPVVPDYLP 

TENGDFLSSKRKQISRDINRIRSVTVNHMSDNRS 

MSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYL 

SDKHLGKISEDESSGLVYKSGSGEIGSETSDKKDS 

FYTDSSSVLNYREDSNILSFDSDGNQNILSSTLTS 

KGNETIESIFKAEDLLPEAASLSENLDISKEETPPV

RTLKSQSSLSGKPKERCPPNLAPLCVSYKNMKRS 

SSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKG 

NKKNSTTNYRGTAESVNAGANLQNYGETSPDAI 

STNSEGAQENHDDLMSWVFKITGVNGEIDIRGE 

DTEICLQVNQVTPDQLGNISLRHYLCNRPVGSDQ 

KAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFL 

OCHTFTMF^TFFT T^<?T A/TNITOWFT FDFTV A TV AyfPtV/f 

KIQVSNTKINLKDDSPRSSTVSLEPAPVTVHIDHL

VVERSDDGSFHIRDSHMLNTGNDLKENVKSDSV 

LLTSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEFSFDFTREQLMEENESLKQELAKAKMALAE 

AHLEKDALLHHIKKMTVE 

3134 

A 

9 

1579 

EEEGLSGGGPRVPCSLWGKQTMDYDFKAKLAA 

ERERVEDLFEYEGCKVGRGTYGHVYKARRKDG 

KDEKEYALKQIEGTGISMSACREIALLRELKHPN 

VIALQKVFLSHSDRKVWLLFDYAEHDLWHIIKFH 

RASKANKKPMQLPRSMVKSLLYQILDGIHYLHA 

NWVLHRDLKPANILVMGEGPERGRVKIADMGF 

ARLFNSPLKPLADLDPVVVTFWYRAPELLLGAR

HYTKAIDIWAIGCIFAELLTSEPIFHCRQEDIKTSN 

PFHHDQLDRIFSVMGFPADKDWEDIRKMPEYPT 

LQKDFRRTTYANSSLIKYMEKHKVKPDSKVFLL 

LOKLLTMDPTKRITSEOALODPYFOEDPLPTLDV 

FAGCQIPYPKJREFLNEDDPEEKGDKNQQQQQNQ 

HQQPTAPPQQAAAPPQAPPPQQNSTQTNGTAGG 

AGAGVGGTGAGLQHSQDSSLNQVPPNKKPRLGP 

SGANSGGPVMPSDYQHSSSRLNYQSSVQGSSQS 

QSTLGYSSSSQQSSQYHPSHQAHRY 

3135 

A 

3 

1111 

ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQ 

LSSRDPPGSLSAKKVRTEEKKAPRRVNGEGGSG 

GNSRQLQPPAAPSPQSYGSPASWSFAPLSAAPSPS 

SSRSSFSFSAGTAVPSSASASLSQPGPRKLLVPPTL 

LHAQPHHLLLPAAAAAASANAKSRRPKEKREKE 

RRRHGLGGAREAGGASREENGEVKPLPRDKIKD 

KIKERDKEKEREKKKHKVMNEIKKENGEVKILL 

KSGKEKPKTNIEDLQIKKVKKKKKKKHKENr^ 

KRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNI 

KDYVGKNLDTKNYDSKIPENSEFPFVSLKEPRVQ

NNLKRLDTLEFKQLIHIEHQPNGGASVIHCLQ 

3136 

A 

1442 

682 

TAAMSIFTPTNQIRLTNVAVVRMKRAGKRFEIAC 

YKNKVVGWRSGVEKDLDEVLQTHSVFVNVSKG 

QVAKKEDLISAFGTDDQTEICKQILTKGEVQVSD 

KERHTQLEQMFRDIATIVADKCVNPETKJIPYTVI 

LIERAMKDIHYSVKTNKSTKQQALEVIKQLKEK 

MKIERAHMRLRFILPWEGKKLKEKLKPLIKVIES 

EDYGQQLEIVCLIDPGCFREIDELIKKETKGKGSL 

EVLNLKDVEEGDEKFE 

3137 

A 

1 

3143 

MVEGKRHVLHGGRQERMRAKQKGKPLIKSSDL 

VRLIHYHHNSSPLHKQSSGPSSSPAAAAAPEKPG 

PKAAEVGDDFLGDFVVGERVWYNGVKPGVVQY 

LGETQFAPGQWAGVVLDDPVGKNDGAVGGVR 

YFECPALQGIFTRPSKLTRQPTAEGSGSDAHSVES 

LTAQNLSLHSGTATPPLTSRVIPLRESVLNSSVKT 

GNESGSNLSDSGSVKRGEKDLRLGDRVLVGGTK 

TGVVRYVGETDFAKGEWCGVELDEPLGKNDGA 

VAGTRYFQCPPKFGLFAPIHKVIRIGFPSTSPAKA

KKTKRMAMGVSALTHSPSSSSISSVSSVASSVGG 

RPSRSGLLTETSSRYARKISGTTALQEALKEKQQ 

HIEQLLAERDLERAEVAKATSHICEVEKEIALLK

AQHEQYVAEAEEKLQRARLLVESVRKEKVDLSN 

QLEEERRKVEDLQFRVEEESITKGDLETQTQLEH 

ARIGELEQSLLLEKAQAERLLRELADNRLTTVAE 

KSRVLQLEEELTLRRGEIEELQQCLLHSGPPPPDH 

PDAAEILRLRERLLSASKEHQRESGVLRDKYEKA 

LKAYQAEVDKLRAANEKYAQEVAGLKDKVQQ

ATSENMGLMDNWKSKLDSLASDHQKSLEDLKA 

TLNSGPGAQQKEIGELKAVMEGIKMEHQLELGN 

LQAKHDLETAMHVKEKEALREKLQEAQEELAG 

LQRHWRAQLEVQASQHRLELQEAQDQRRDAEL 

RVHELEKLDVEYRGQAQAIEFLKEQISLAEKKML 

DYERLQRAEAQGKQEVESLREKLLVAENRLQAV 

EALCSSQHTHMIESNDISEETIRTKETVEGLQDKL 

NKRDKEVTALTSQTEMLRAQVSALESKCKSGEK 

KVDALLKEKRRLEAELETVSRKTHDASGQLVLIS 

QELLRKERSLNELRVLLLEANRHSPGPERDLSRE 

VHKAEWRIKEQKLKDDIRGLREKLTGLDKEKSL 

SDQRRYSLIDPSSAPELLRLQHQLMSTEDALRDA 

LDQAQQVEKLMEAMRSCPDKAQTIGNSGSANGI 

HQQDKAQKQEDKH 

3138 

A 

110 

2499 

QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

rTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS 

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIEEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASWDIKLLLRJWDLFFYEGSRVLFQLTLGMLHL 

KEEEL1QSENSASIFNTLSDJPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYLIADQGQLLGA 

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 





VVSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWLFIEE 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 

3139 

A 

110 

2499 

QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSAL 

TPSIWPQEILAKYTQKEESAEQPEFYYDEFGFRV 

YKEEGDEPGSSLLANSPLMEDAPQRLRWQAHLE 

FTHNHDVGDLTWDKIAVSLPRSEKLRSLVLAGIP 

HGMRPQLWMRLSGALQKKRNSELSYREIVKNSS

NDETIAAKQIEKDLLRTMPSNACFASMGSIGVPR 

LRRVLRALAWLYPEIGYCQGTGMVAACLLLFLE 

EEDAFWMMSAIIEDLLPASYFSTTLLGVQTDQRV 

LRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAF 

ASVVDIKLLLR1WDLFFYEGSRVLFQLTLGMLHL 

KEEELIQSENSASIFNTLSDIPSQMEDAELLLGVA 

MRLAGSLTDVAVETQRRKHLAYL1ADQGQLLGA 

GTLTNLSQVVRRRTQRRKSTITALLFGEDDLEAL 

KAKNIKQTELVADLREAILRVARHFQCTDPKNCS 

VVSRQLPGLLPNTALTPPTPLVGLCSLWQELTPD 

YSMESHQRDHENYVACSRSHRRRAKALLDFERH 

DDDELGFRKNDIITIVSQKDEHCWVGELNGLRG 

WFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLV 

RGTLCPALKALFEHGLKKPSLLGGACHPWT FTFF 

AAGREVERDFASVYSRLVLCKTFRLDEDGKVLT 

PEELLYRAVQSVNVTHDAVHAQMDVKLRSLICV 

GLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRS 

PGWVQIKCELRVLCCFAFSLSQDWELPAKREAQ 

QPLKEGVRDMLVKHHLFSWDVDG 

3140 

A 

1 

4939 

SAALGASLAIPRPGLPGVHGRGPGTLSGRAMEG 

AEPRARPERLAEAETRAADGGRLVEVQLSGGAP 

WGFTLKGGREHGEPLVITKIEEGSKAAAVDKLL 

AGDErVGINDIGLSGFRQEAICLVKGSHKTLKLV 

VKRRSELGWRPHSWHATKFSDSHPELAASPFTST 

SGCPSWSGRHHASSSSHDLSSSWEQTNLQRTLD 

HFSSLGSVDSLDHPSSRLSVAKSNSSIDHLGSHSK 

RDSAYGSFSTSSSTPDHTLSKADTSSAENILYTVG 

LWEAPRQGGRQAQAAGDPQGSEEKLSCFPPRVP 

GDSGKGPRPEYNAEPKLAAPGRSNFGPVWYVPD 

KKKAPSSPPPPPPPLRSDSFAATKSHEKAQGPVFS 

EAAAAQHFTALAQAQPRGDRRPELTDRPWRSAH 

PGSLGKGSGGPGCPQEAHADGSWPPSKDGASSR 

LQASLSSSDVRFPQSPHSGRHPPLYSDHSPLCADS 

LGQEPGAASFQNDSPPQVRGLSSCDQKLGSGWQ 

GPRPCVQGDLQAAQLWAGCWPSDTALGALESL 

PPPTVGQSPRHHLPQPEGPPDARETGRCYPLDKG 

AEGCSAGAQEPPRASRAEKASQRLAASITWADG 

ESSRICPQETPLLHSLTQEGKRRPESSPEDSATRPP 

PFDAHVGKPTRRSDRFATTLRNEIQMHRAKLQK 

SRSTVALTAAGEAEDGTGRWRAGLGGGTQEGPL 





AGTYKDHLKEAQARVLRATSFKJRRDLDPNPGDL 

YPESLEHRMGDPDTVPHFWEAGLAQPPSSTSGGP 

HPPRIGGRRRFTAEQKLKSYSEPEKMNEVGLTRG 

YSPHQHPRTSEDTVGTFADRWKFFEETSKPVPQR 

PAQKQALHGIPRDKPERPRTAGRTCEGTEPWSRT 

TSLGDSLNAHSAAEKAGTSDLPRRLGTFAEYQAS 

WKEQRKPLEARSSGRCHSADDILDVSLDPQERPQ 

HVHGRSRSSPSTDHYKQEASVELRRQAGDPGEP 

REELPSAVRAEEGQSTPRQADAQCREGSPGSQQ 

HPPSQKAPNPPTFSELSHCRGAPELPREGRGRAG 

TLPRDYRYSEESTPADLGPRAQSPGSPLHARGQD 

SWPVSSALLSKRPAPQRPPPPKREPRRYRATDGA 

PADAPVGVLGRPFPTPSPASLDVYVARLSLSHSPS 

VFSSAQPQDTPKATVCERGSQHVSGDASRPLPEA 

LLPPKQQHLRLQTATMETSRSPSPQFAPQKLTDK 

PPLLIQDEDSTRIERVMDKNTTVKMVPIKIVHSES 

QPEKESRQSLACPAEPPALPHGLEKDQIKTLSTSE 

QFYSRFCLYTRQGAEPEAPHRAQPAEPQPLGTQV 

PPEKDRCTSPPGLSYMKAKEKTVEDLKSEELARE 

IVGKDKSLADILDPSVKIKTTMDLMEGIFPKDEH 

LLEEAQQRRKLLPKIPSPRSTEERKEEPSVPAAVS 

LATNSTYYSTSAPKAELLIKMKDLQEQQEHEEDS 

GSDLDHDLSVKKOELIESISRKLOVLREARESLLE 

DVQANTVLGAEVEAIVKGVCKPSEFDKFRMFIG 

DLDKVVNLLLSLSGRLARVENALNNLDDGASPG 

DRQSLLEKQRVLIQQHEDAKELKENLDRRER1VF 

DILANYLSEESLADYEHFVKMKSAL11EQRELED 

KIHLGEEQLKCLLDSLQPERGK 

3141 

A 

97 

1894 

SPRGATMETPPLPPACTKQGHQKPLDSKDDNTE 

KHCPVTVNPWHMKKAFKVMNELRSQNLLCDVT 

IVAEDMEISAHRVVLAACSPYFHAMFTGEMSESR 

AKRVRKEVDGWTLRMLIDYVYTAEIQVTEENV 

QVLLPAAGLLQLQDVKKTCCEFLESQLHPVNCL 

GIRAFADMHACTDLLNKANTYAEQHFADVVLSE 

EFLNLGIEQVCSLISSDKLTISSEEKVFEAVIAWV 

NHDKDVRQEFMARLMEHVRLPLLPREYLVQRV 

EEEALVKNSSACKNYLIEAMKYHLLPTEQRILMK 

SVRTRLRTPMNLPKLMVWGGQAPKAIRSAECY 

DFKEQRWHQVAELPSRRCRAGMVYLAGLVFAV 

GGFNGSLRVRTVDSYDPVKDQWTSVANMRDRJR 

STLGAAVLNGLLYAVGGFDGSTGLSSVEAYNIKS 

NEWFHVAPMNTRRSSVGVGWGGLLYAVGGYD 

GASRQYLSTVECYNATTNEWTYIAEMSTRRSGA 

GVGVLNNLLYAVGGHDGPLVRKSVEVYDPTTN 

AWRQVADMNMCRRNAGVCAVNGLLYVVGGD 

DGSCNLASVEYYNPTTDKWTVVSSCMSTGRSYA 

GVTVIDKPL 

3142 

A 

1211 

1311 

FS^TTEKVAHAKEENLSMHQMLDQTLLELNN 
M 

3143 

A 

1809 

1041 

SEELDREKKLKEDSPRKTPNKJESGVPSLPVSLTSI 

KEEPKEAKHPDSQSMEESKLKNDDRKTPVNWK 

DSRGTRVAVSSPMSQHQSYJQYLHAYPYPQMYD 

PSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYGK 

MSGREETEKVNTSPSVNTKTTTESKALDLLQQH 

ANQYRSKSPAPVEKATAEREREAERERDRHSPFG 





QRHLHTHHHTHVGMGYPLIPGQYDPFQGLTSAA 
LVASQQVAAQASASGMFPGQRR 


A 
rV 

/o 


^v<?nrvT fit t pyt wft ^TsrvrwT nn^AonPFKBFY^ 

SVCVGREDDIKKSERMTAVVHDREVVIFYHKGE 

YHAMDIRCYHSGGPLHLGDIEDFDGRPCIVCPW 

HKYKITLATGEGLYQSINPKDPSAKPKWCSKGIK 

ORTHTVTVDNCiNlYVTl WFPFKCDSDFYATGDF 

KVIKSSS 

3145 

A 

2 

333 

RNSLLLPPLHLDNSTPAKMSCQQNQQQCQPPPK 
CPSPKCPPKSPVQCLPPASSGCAPSSGGCGPSSEG 
GCFLNHHRRHHRCRRQRPNSCDRGSGQQGGGS 
GCGHGSGGCC 

3146 

A 

3 

1151 

VCTALQEFGTRSTLLRCLDSGFRPGASRGLVGSW 

AAMESTLGAGIVIAEALQNQLAWLENVWLWITF 

LGDPKILFLFYFPAAYYASRRVG1AVLWISLITEW 

LNLIFKWFLFGDRPFWWVHESGYYSQAPAQVHQ 

FPSSCETGPGSPSGHCMITGAALWPIMTALSSQV 

ATRARSRWVRVMPSLAYCTFLLAVGLSRIFILAH 

FP140VT AHT TTHAVT fiWH MTPPVPN/fFPFF QFYH 
r r rl v * LAvjL.I 1 Vj/\ V L»vj W L/IVI l r rv V riViJDivc.L.or I vJ 

LTALALMLGTSLIYWTLFTLGLDLSWSISLAFKW 

CERPEWIHVDSRPFASLSRDSGAALGLG1ALHSPC 

YAQVRRAQLGNGQKIACLVLAMGLLGPLDWLG 

HPPQISLFYIFNFLKYTLWPCLVLALVPWAVHMF 

SAQEAPPIHSS 

3147 

A 

1437 

594 

RSFSLSFSLLSPSEMMALGAAGATRVFVAMVAA 
ALGGHPLLGVSATLNSVLNSNAIKNLPPPLGGAA 
GHPGSAVSAAPGILYPGGNKYQTIDNYQPYPCAE 
ncppnTnFvr a qptp nnnAf. vn tpt a r*p vpp^p 

Ux^cL^vJ 1 UC I LAor 1 l\.UVJL'/\<J V l^l^J^/\v^lVlSJVi\JVix 

CMRHAMCCPGNYCKNGICVSSDQNHFRGEIEETI 

TESFGNDHSTLDGYSRRTTLSSKMYHTKGQEGS 

VCLRSSDCASGLCCARHFWSKICKPVLKEGQVC 

TKHRRKGSHGLEIFQRCYCGEGLSCR1QKDHHQ 

ASNSSRLHTCQRH 

3148 

A 

1 

1562 

MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLH 

TPKLEHLDRVLYEWFLGKRSEGVPVSGPMLIEK 

AKDFYEQMQLTEPCVFSGGWLWRFKARHGIKK 

LDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 

EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGK 

DRLTVLMCANATGSHRLKPLA1GKCSGPRAFKGI 

QHLPVAYKAQGNAWVDKEIFSDWFHHIFVPSVR 

EHFRTIGLPEDSKAVLLLDSSRAHPQEAELVSSN 

VFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVP 

LQGPHARYNMNDAIFSVACAWNAVPSHVFRRA 

WRKT WP<sVAFAFG<?<?^FFFT FAFPFPVKPFTNKSF 

AffiLELVKEGSSCPGQLRQRQAASWGVAGREAE 

GGRPPAATSPAEVVWSSEKTPKADQDGRGDPGE 

GEEVAWEQAAVAFDAVLRFAERQPCFSAQEVG 

QLRALRAVFRSQQQVRRRRGALGAVVKVEALQ 

EGPGGCGATAQSPLPCSSTAGDN 

3149 

A 

1 JZ. 

419S 

VAVMISTAPLYSGVHNWTSSD1URMCGINEERRA 

PLSDEESTTGDCQHFGSQEFCVSSSFSKVELTAV 

GSGSNARGADPDGSATEKLGHKSEDKPDDPQPK 

MDYAGNVAEAEGLLVPLSSPGDGLKLPASDSAE 

ASNSRADCSWTPLNTQMSKQVDCSPAGVKALDS 

RQGVGEKNTFILATLGTGVPVEGTLPLVTTOFSP 





LPAPICPPAPSSASVPHSVPDAFQAPVPPSAPTLVL 

APVPTPVLAPMPASTPPAAPAPPSVPMPTPTPSSG 

PPSTPTLIPAFAPTPVPAPTPAPIFTPAPTPMPAATP 

AAIPTSAPIPASFSLSRVCFPAAQAPAMQKVPLSF 

QPGTVLTPSQPLVYIPPPSCGQPLSVATLPTTLGV 

SSTLTLPVLPSYLQDRCLPGVLASPELRSYPYAFS 

VARPLTSDSKLVSLEVNRLPCTSPSGSTTTQPAPD 

GVPGPLADTSLVTASAKVLPTPQPLLPAPSGSSAP 

PHPAKMPSGTEQQTEGTSVTFSPLKSPPQLEREM 

ASPPECSEMPLDLSSK.SNRQKLPLPNQRKTPPMP 

VLTPVHTSSKALLSTVLSRSQRTTQAAGGNVTSC 

LGSTSSPFVIFPEIVRNGDPSTWVKNSTALISTIPG 

TYVGVANPVPASLLLNKDPNLGLNRDPRHLPKQ 

EPISIIDQGEPKGTGATCGKKGSQAGAEGQPSTV 

KRYTPARIAPGLPGCQTKELSLWKPTGPANIYPR 

CSVNGKPTSTQVLPVGWSPYHQASLLSIGISSAG ! 

QLTPSQGAPIRPTSVVSEFSGVPSLSSSEAVHGLP 

EGQPRPGGSFVPEQDPVTKNKTCR1AAKPYEEQV 

NPVLLTLSPQTGTLALSVQPSGGDIRMNQGPEES 

ESHLCSDSTPKMEGPQGACGLKLAGDTKPKNQV 

LATYMSHELVLATPQNLPKMPELPLLPHDSHPKE 

LILDVVPSSRRGSSTERPQLGSQVDLGRVKMEKV 

DGDVVFNLATCFRADGLPVAPQRGQAEVRAKA 

GQARVKQESVGVFACKNKWQPDDVTESLPPKK 

MKCGKEKDSEEQQLQPQAKAVVRSSHRPKCRK 

LPSDPQESTKKSPRGASDSGKEHNGVRGKHKHR 

KPTKPESQSPGKRADSHEEGSLEKKAKSSFRDFIP 

VVLSTRTRSQSDLKARKQKTSSSQSLEHRLRNRN 

LLLPNKVQGISDSPNGFLPNNLEEPACLENSEKPS 

GKRKCKTKHMATVSEEAKGKGRWSQQKTRSPK 

SPTPVKPTEPCTPSKSRSASSEEASESPTARQIPPE 

ARRLIVNKNAGETLLQRAARLGYKDVVLYCLQK 

DSEDVNHRDNAGYTALHEACSRGWTDILNILLE 

HGA 

3150 

A 

3 

2795 

SLRMHNLSILVRQIKFYYQETLQQLIMMSLPNVLI 

IGKNPFSEQGTEEVKKLLLLLLGCAVQCQKKEEF 

IERIQGLDFDTKAAVAAHIQEVTHNQENVFDLQ 

WMEVTDMSQEDIEPLLKNMALHLKRLIDERDEH 

SET1IELSEERDGLHFLPHASSSAQSPCGSPGMKR 

TESRQHLSVELADAKAEORRLRQELEEKTEQLLD 

CKQELEQMEIELKRLQQENMNLLSDARSARMYR 

DELDALREKAVRVDKLESEVSRYKERLHDIEFY 

KARVEELKEDNQVLLETKTMLEDQLEGTRARSD 

KLHELEKENLQLKAKLHDMEMERDMDRKKIEE 

LMEENMTLEMAQKQSMDESLHLGWELEQISRTS 

ELSEAPQKSLGHEVNELTSSRLLKLEMENQSLTK 

TVEELRTTVDSVEGNASKILKMEKENQRLSKKV 

EILENEIVQEKQSLQNCQNLSKDLMKEKAQLEKT 

IETLRENSERQIK1LEQENEHLNQTVSSLRQRSQIS 

AEARVKDIEKENKILHESIKETSSKLSKffiFEKRQI 

KKELEHYKEKGERAEELENELHHLEKENELLQK 

KITNLKITCEKJEALEQENSELERENRKLKKTLDS 

FKNLTFQLESLEKENSQLDEENLELRRNVESLKC 

ASMKMAQLQLENKELESEKEQLKKGLELLKASF 

KKTERLEVSYQGLDIENQRLQKTLENSNKKIQQL 





ESELQDLEMENQTLQKNLEELKISSKRLEQLEKE 

vn/CT t?rU7T%:ni WTWVTfl PfcTPMKRI RriOAPTVn 

TTLEENNVKIGNLEKENKTLSKEIGIYKESCVRLE 

ELEKENKELVKRATIDIKTLVTLREDLVSEKLKT 

QQMNNDLEKLTHELEKIGLNKERLLHDEQSTDD 

SRYKLLESKLESTLKKSLEIKEEK1AALEARLEES 

TNYNQQLRQELKTVKKK 

3151 

A

2 

2515 

GFWLHLTLLGASLPAALGWMDPGTSRGPDVGV 

GESQAEEPRSFEVTRREGLSSHNELLASCGKKFC 

SRGSRCVLSRKTGEPECQCLEACRPSYVPVCGSD 

GRFYENHCKLHRAACLLGKRITVIHSKDCFLKGD 

TCTMAGYARLKNVLLALQTRLQPLQEGDSRQDP 

ASQKRLLVESLFRDLDADGNGHLSSSELAQHVL 

KKQDLDEDLLGCSPGDLLRFDDYNSDSSLTLREF 

YMAFQWQLSLAPEDRVSVTTVTVGLSTVLTCA 

VHGDLRPPIIWKRNGLTLNFLDLEDINDFGEDDS 

LY1TKVTTIHMGNYTCHASGHEQLFQTHVLQVN 

VPPVIRVYPESQAQEPGVAASLRCHAEGIPMPRIT 

WLKNGVDVSTQMSKQLSLLANGSELHISSVRYE 

DTGAYTCIAKNEVGVDEDISSLFIEDSARKTLANI 

LWREEGLSVGNMFYVFSDDGI1VIHPVDCEIQRH 

LKPTEKIFMSYEEICPQREKNATQPCQWVSAVNV 

RNRYIYVAQPALSRVLVVDIQAHKVLQSIGVDPL 

PAKLSYDKSHDQVWVLSWGDVHKSRPSLQVITE 

ASTGQSQHLIRTPFAGVDDFFIPPTNLIINHIRFGFI 

FNKSDPAVHKVDLETMMPLKTIGLHHHGCVPQA 

MAH 1 HLOO Y r r lQCKl^ Ubr Ab A AKv^LL V JJb V 1 L> 

SVLGPNGDVTGTPHTSPDGRFIVSAAADSPWLHV 

QEITVRGEIQTLYDLQINSGISDLAFQRSFTESNQ 

YNIYAALHTEPDLLFLELSTGKVGMLKNLKEPPA 

GPAQPWGGTHRIMRDSGLFGQYLLTPARESLFLI 

NGRQNTLRCEVSG1KGGTTVVWVGEV 

3152 

A 

1 

2645 

GAGWQVSLTGRWSPGREAGAGEVRQDPGSTAA 

SPSSCDADLSARMARGERRRRAVPAEGVRTAER 

AARGGPGRRDGRGGGPRSTAGGVALAVVVLSL 

ALGMSGRWVLAWYRARRAVTLHSAPAVLPADS 

SSPAVAPDLFWGTYRPHVYFGMKTRSPKPLLTG 

LMWAQQGTTPGTPKLRHTCEQGDGVGPYGWEF 

HDGLSFGRQHIQDGALRLTTEFVKRPGGQHGGD 

WSWRVTVEPQDSGTSALPLVSLFFYVVTDGKEV 

LLPEVGAKGQLKFISGHTSELGDFRFTLLPPTSPG 

DTAPKYGSYNVFWTSNPGLPLLTEMVKSRLNSW 

FQHRPPGASPERYLGLPGSLKWEDRGPSGQGQG 

QFLIQQVTLKIPISIEFVFESGSAQAGGNQALPRLA 

GSLLTQALESHAEGFRERFEKTFQLKEKGLSSGE 

QVLGQAALSGLLGGIGYFYGQGLVLPDIGVEGSE 

QKVDPALFPPVPLFTAVPSRSFFPRGFLWDEGFH 

QLWQRWDPSLTREALGHWLGLLNADGWIGRE 

QILGDEARARVPPEFLVQRAVHANPPTLLLPVAH 

mt PvnnpnnT aft ai prt hawfswt tjosoa 

GPLPLSYRWRGRDPALPTLLNPKTLPSGLDDYPR 

ASHPSVTERHLDLRCWVALGARVLTRLAEHLGE 

AEVAAELGPLAASLEAAESLDELHWAPELGVFA 

DFGNHTKAVQLKPRPPQGLVRVVGRPQPQLQYV 

DALGYVSLFPLLLRLLDPTSSRLGPLLD1LADSRH





LWSPFGLRSLAASSSFYGQRNSEHDPPYWRGAV 
WLNVNYLALGALHHYGHLEGPHQARAAKLHGE 
LRANVVGNVWRQYQATGFLWEQYSDRDGRGM 
GCRPFHGWTSLVLLAMAEDY 

3153 

A 

1 

4312 

MVIKTDELPAAAPADSAREHGSQAGGKGRPGAA

AVLLADLERDARQGECALPGAAMAGLAPLKPE 

ASRSSSPGPTGCIRARVAAEAGTRNPGNAGAELE 

SWLPCCHGHPETPEPRGGQLPTAPELPSVMLLNG 

DCPESLKKEAAAAEPPRENGLDEAGPGDETTGQ 

EVIVIQDTGFSVKILAPGIEPFSLQVSPQEMVQEIH 

QVLMDREDTCHRTCFSLHLDGNVLDHFSELRSV 

EGLQEGSVLRVVEEPYTVREARIHVRHVRDLLKS 

LDPSDAFNGVDCNSLSFLSVFTDGDLGDSGKRK 

KGLEMDPIDCTPPEYILPGSRERPLCPLQPQNRD 

WKPLQCLKVLTMSGWNPPPGNRKMHGDLMYLF 

VITAEDRQVSITASTRGFYLNQSTAYHFNPKPASP 

RFLSHSLVELLNQISPTFKKNFAVLQKKRVQRHP 

FERIATPFQVYSWTAPQAEHAMDCVRAEDAYTS 

RLGYEEHIPGQTRDWNEELQTTRELPRKNLPERL 

LRERAIFKVHSDFTAAATRGAMAVIDGNVMAIN 

PSEETKMQMFIWNNIFFSLGFDVRDHYKDFGGD 

VAAYVAPTNDLNGVRTYNAVDVEGLYTLGTVV 

VDYRGYRVTAQSIIPGILERDQEQSVIYGSIDFGK 

TVVSHPRYLELLERTSRPLKILRHQVLNDRDEEV 

ELCSSVECKGIIGNDGRHYILDLLRTFPPDLNFLP 

VPGEELPEECARAGFPRAHRHKLCCLRQELVDA 

FVEHRYLLFMKLAALQLMQQNASQLETPSSLEN 

GGPSSLESKSEDPPGQEAGSEEEGSSASGLAKVK 

ELAET1AADDGTDPRSREVIRNACKAVGSISSTAF 

DIRFNPDIFSPGVRFPESCQDEVRDQKQLLKDAA 

AFLLSCQIPGLVKDCMEHAVLPVDGATLAEVMR

QRGINMRYLGKVLELVLRSPARHQLDHVFKJGIG 

ELITRSAKHIFKTYLQGVELSGLSAAISHFLNCFLS 

SYPNPVAHLPADELVSKKRNKRRKNRPPGAADN 

TAWAVMTPQELWKNICQEAKNYFDFDLECETV 

DQAVETYGLQKITLLREISLKTGIQVLLKEYSFDS 

RHKPA1TEEDVLNIFPVVKHVNPKASDAFHFFQS 

GQAKVQQGFLKEGCELINEALNLFNNVYGAMH 

VETCACLRLLARLHYIMGDYAEALSNQQKAVL 

MSERVMGTEHPNTIQEYMHLALYCFASSQLSTA 

LSLLYRARYLMLLVFGEDHPEMALLDNNIGLVL 

HGVMEYDLSLRFLENALAVSTKYHGPKALKVAL 

SHHLVARVYESKAEFRSALQHEKEGYT1YKTQL 

GEDHEKTKESSEYLKCLTQQAVALQRTMNEIYR 

NGSSANIPPLKFTAPSMASVLEQLNVINGILFIPLS 

QIODLEmKAEVARRHQLQEASRNRDRAEEPMA 

TEPAPAGAPGDLGSQPPAAKDPSPSVQG 

3154 

A 

416

4082 

KFKl.IKIMLLTLnLLPVVSKPSFVSLSAPQHWSCP 

EGTLAGNGNSTCVGPAPFLIFSHGNSIFRIDTEGT 

NYEQLVVDAGVSVIMDFHYNEKRIYWVDLERQ 

LLQRVFLNGSRQERVCNDEKNVSGMArNWINEEV 

IWSNQQEGIITVTDMKGNNSHILLSALKYPANVA 

VDPVERFIFWSSEVAGSLYRADLDGVGVKALLE 

TSEKITAVSLDVLDKRLFWIQYNREGSNSLICSCD 

YDGGSVHISKHPTQHNLFAMSLFGDRIFYSTWK 





MKTIW^NKHTGKDMVRINLHSSFVPLGELKVV 

HPLAQPKAEDDTWEPEQKLCKLRKGNCSSTVCG 

QDLQSHLCMCAEGYALSRDRKYCEGNDWKYCE 

DVNECAFWNHGCTLGCKNTPGSYYCTCPVGFVL 

LPDGKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCSSPDNGGCSQLCVPLSP 

VSWECDCFPGYDLQLDEKSCAASGPQPFLLFANS 

QDIRHMHFDGTDYGTLLSQQMGMVYALDHDPV 

ENK1YFAHTALKWIERANMDGSQRERLIEEGVD 

VPEGLAVDWIGRRFYWTDRGKSLIGRSDLNGKR 

SKIITIENISQPRGIAVHPMAKRLFWTDTGINPRIE 

SSSLQGLGRLV1ASSDLIWPSGITIDFLTDKLYWC 

DAKQSVIEMANLDGSKRRRLTQNDVGHPFAVA 

VFEDYVWFSDWAMPSVIRVNKRTGKDRVRLQG 

SMLKPSSLVVVHPLAKPGADPCLYQNGGCEHIC 

KKRLGTAWCSCREGFMKASDGKTCLALDGHQL 

LAGGEVDLKNQVTPLDILSKTRVSEDNITESQHM 

LVAEIMVSDQDDCAPVGCSMYARCISEGEDATC 

QCLKGFAGDGKLCSDIDECEMGVPVCPPASSKCI 

NTEGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPDSTP 

PPHLREDDHHYSVRNSDSECPLSHDGYCLHDGV 

CMYIEALDKYACNCVVGYIGERCQYRDLKWWE 

LRHAGHGQQQKVIWAVCVWLVMLLLLSLWG 

AHYYRTQKLLSKNPKNPYEESSRDVRSRRPADT 

EDGMSSCPQPWFVVIKEHQDLKNGGQPVAGED 

GQAADGSMQPTSWRQEPQLCGMGTEQGCWIPV 

SSDKGSCPQVMERSFHMPSYGTQTLEGGVEKPH 

SLLSANPLWQQRALDPPHQMELTQ 

3155 

A 

533 

212 

GTSGWYWERLAERRGRLWSREEAMATMENKVI 
CALVLVSMLALGTLAEAQTETCTVAPRERQNCG 
FPGVTPSQCANKGCCFDDTVRGVPWCFYPNTID 
VPPEEECEF 

3156 

A 

2 

1585 

PRVRAADVAAGAQAWSAGMAKSNGENGPRAP 

AAGESLSGTRESLAQGPDAATTDELSSLGSDSEA 

NGFAERRIDKFGFIVGSQGAEGALEEVPLEVLRQ 

RESKWLDMLNNWDKWMAKKHKKIRLRCQKGI 

PPSLRGRAWQYLSGGKVKLQQNPGKFDELDMSP 

GDPKWLDVIERDLHRQFPFHEMFVSRGGHGQQD 

LFRVLKAYTLYRPEEGYCQAQAPIAAVLLMHMP 

AEQAFWCLVQICEKYLPGYYSEKLEAIQLDGE1L 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCA 

FSRTLPWSSVLRVWD^1FFCEGVK1IFRVGLVLLK 

HALGSPEKVKACQGQYETIERLRSLSPKIMQEAF 

LVQEVVELPVTERQIEREHLLQLRRWQETRGELQ 

CRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLD 

APLPGSKAKPKPPKQAQKEQRKQMKGRGQLEKP 

PAPNQAMVVAAAGDACPPQHVPPKDSAPKDSAP 

QDLAPQVSAHHRSQESLTSQESEDTYL 

3157 

A 

3 

601 

SSAMGSRSSHAAVBPDGDSIRRETGFSQASLLRLH 

HRFRALDRNKKGYLSRMDLQQIGALAlVNPLGDR 

IDESFFPDGSQRVDFPGFVRVLAHFRPVEDEDTET 

QDPKKPEPLNSRRNKLHYAFQLYDLDRDGKISR 

HEMLQVLRLMVGVQVTEEQLENIADRTVQEAD 

EDGDGAVSFVEFTKSLEKMDVEHKMSIRILK 

3158 

A 

2 

409 

ISSCPHTAYEGSMSTLSNFTQTLEDVFRR1FITYM 
DNWRQNTTAEQEALQAKVDAENFYTVILYLMV 
MIGMFSFIIVAILVSTVKSKRREHSNDPYHQYIVE 
DWQEKYKSQILNLEESKAT1HENIGAAGFKMSP 

3159 

A 

3 

416 

PWGAAELDMGRRDAQLLAALLVLGLCALAGSE 

KPSPCQCSRLSPHNRTNCGFPGITSDQCFDNGCCF 

DSSVTGVPWCFHPLPKQESDQCVMEVSDRRNCG 

YPGISPEECASRKCCFSNFIFEVPWCFFPKSVEDC 

HY 

3160 

A 

179 

409 

KPKTKILKMVYYPELFVWVSQEPFPNKDMEGRL 
PKGRLPVPKEVNRKKNDETNAASLTPLGSSELRS 
PRISYLHFF 

3161 

A 

683 

1186 

LSSTGGLHAAACAAAMSLVIPEKFQHILRVLNTN 

IDGRRKIAFAITAIKGVGRRYAHVVLRXADIDLT 

KRAGELTEDEVERVITIMQNPRQYKIPDWFLNRQ 

KDVKDGKYSQVLANGLDNKLREDLERLKKIRA 

HRGLRHFWGLRVRGQHTKTTGRRGRTVGVSKK 

K 

3162 

A 

t 

1938 

GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWRVP 

GRLLLLLLPALCCLPGAARAAAAAAGAGNRAA 

VAVAVARADEAEAPFAGQNWLKSYGYLLPYDS 

RASALHSAKALQSAVSTMQQFYGIPVTGVLDQT 

TIEWMKKPRCGVPDHPHLSRRRRNKRYALTGQK 

WRQKHITYSIHNYTPKVGELDTRKAIRQAFDVW 

QKVTPLTFEEVPYHEIKSDRKEAD1MIFFASGFHG 

DSSPFDGEGGFLAHAYFPGPG1GGDTHFDSDEPW 

TLGNANHDGNDLFLVAVHELGHALGLEHSSDPS 

AIMAPFYQYMETHNFKLPQDDLQGIQKIYGPPAE 

PLEPTRPLPTLPVRRIHSPSERKHERQPRPPRPPLG 

DRPSTPGTKPNICDGNFNTVALFRGEMFVFKDR 

WFWRLRNNRVQEGYPMQIEQFWKGLPARIDAA 

YERADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGIDTALRWEPVGKTYFFKGERYWR 

YSEERRATDPGYPKPITVWKGIPQAPQGAFISKE 

GYYTYFYKGRDYWKFDNQKLSVEPGYPRNILRD 

WMGCNQKEVERRKERRLPQDDVDIMVTINDVP 

GSVNAVAVVIPCILSLCILVLVYTIFQFKNKTGPQ 

PVTYYKRPVQEWV 

3163 

A 

1235 

2223 

SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMNDSL 

RTNVFVRFQPETIACACIYLAARALQIPLPTRPHW 

FLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEV 

EKRKVALQEAKLKAKGLNPDGTPALSTLGGFSP 

ASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQA 

SKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRS 

HTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRR 

HHNHGSPHLKAKHTRDDLKSSNRHGHKRKKSRS 

RSQSKSRDHSDAAKKHRHERGHHRDRRERSRSF 

ERSHKSKHHGGSRSGHGRHRR 

3164 

A 

3 

3274 

DCRLQAAMPTNFTVVPVEAHADGGGDETAERT l . 

EAPGTPEGPEPERPSPGDGNPRENSPFLNNVEVE 

QESFFEGKNMALFEEEMDSNPMVSSLLNKLANY 

TNLSQGVVEHEEDEESRRREAKAPRMGTFIGVY 

LPCLQNILGVILFLRLTWIVGVAGVLESFLIVAMC 

CTCTMLTAISMSAIATNGVVPAGGSYYMISRSLG 

PEFGGAVGLCFYLGTTFAGAMYILGTIEIFLTYISP 





GAAIFQAEAAGGEAAAMLHNMRVYGTCTLVLM 

ALWFVGVKYVNKLALVFLACVVLSILAIYAGVI 

KSAFDPPDIPVCLLGNRTLSRRSFDACVKAYGIH 

NNSATSALWGLFCNGSQPSAACDEYFIQNNVTEI 

QGIPGAASGVFLENLWSTYAHAGAFVEKKGVPS 

VPVAEESRASTLPYVLTDIAASFTLLVGIYFPSVT 

GIMAGSNRSGDLKDAQKSIPTGTILAIVTTSFIYLS 

CIVLFGACIEGVVLRDKFGEALQGNLVIGMLAW 

PSPWVIVIGSFFSTCGAGLQTLTGAPRLLQAIARD 

GIVPFLQVFGHGKANGEPTWALLLTVLICETGILI 

ASLDSVAPDLSMFFLMCYLFVNLACAVQTLLRTP 

NWRPRFKFYHWTLSFLGMSLCLALMFICSWYYA 

LSAMLIAGCIYKYIEYRGAEKEWGDGIRGLSLNA 

ARYALLRVEHGPPHTKNWRPQVLVMLNLDAEQ 

AMKHPRLLSFTSQLKAGKGLTIVGSVLEGTYLD 

KHMEAQRAEENIRSLMSTEKTKGFCQLVVSSSLR 

DGMSHLIQSAGLGGLKHNTVLMAWPASWKQED 

OTFSWKNFVDTVRDTTAAHQALLVAKNVDSFPQ 

NQERFGGGHIDVWWIVHDGGMLMLLPFLLRQH 

KVWRKCRMR1FTVAQVDDNSIQMKKDLQMFLY 

QMLKQMQLSKNEQEREAQLIHDRNTASHTAAA 

ARTQAPPTPDKVQMTWTREKLIAEKYRSRDTSL 

SGFKDLFSMKPDQSNVRRMHTAVKLNGVVLNK 

SQDAQLVLLNMPGPPKNRQGDENYMEFLEVLTE 

GLNRVLLVRGGGREVITIYS 

3165 

A 

3 

2681 

GRGARGGSGAGALRGCRGYLQKLSGKGPSRGY 

RSRWFVFDARRCYLYYFKSPQDALPLGHLDIAD 

ACFSYQGPDEAAEPGTEPPAHFQVHSAGAVTVL 

KAPNRQLMTYWLQELQQKRWEYCNSLDMVKW 

DSRTSPTPGDFPKGLVARDNTDLIYPHPNASAEK 

ARNVLAVETVPGELVGEQAANQPAPGHPNSINF 

YSLKQWGNELKNSMSSFRPGRGHNDSRRTVFYT 

NEEWELLDPTPKDLEESIVQEEKKKLTPEGNKGV 

TGSGFPFDFGRNPYKGKRPLKDIIGSYKNRHSSG 

DPSSEGTSGSGSVSIRKPASEMQLQVQSQQEELE 

QLKKDLSSQKELVRLLQQTVRSSQYDKYFTSSRL 

CEGVPKDTLELLHQKDDQILGLTSQLERFSLEKE 

SLQQEVRTLKSKVGELNEQLGMLMETIQAKDEV 

IDCLSEGEGNGPPPTVAPSSPSVVPVARDQLELDR 

LKDNLQGYKTQNKFLNKEILELSALRRNPERRER 

DLMARNSSLEAKLCQIESKYLILLQEMKTPVCSE 

DQGPTREVIAQLLEDALQVESQEQPEQAFVKPHL 

VSEYDIYGFRTVPEDDEEEKLVAKVRALDLKTL 

YLTENQEVSTGVKWERYFASTVNREMMCSPEL 

KNLIRAGIPHEHRSKVWKWCVDRHTRKFKDNTE 

PGHFQTLLQKALEKQNPASKQIELDLLRTLPNNK 

MV^rPT^FGTOKT RNVT T AFSWRNPDIGYCOGLN 

RLVAVALLYLEQEDAFWCLVTIVEVFMPRDYYT 

KTLLGSQVDQRVFRDLMSEKLPRLHGHFEQYKV 

DYTLITFNWFLVVFVDSVVSD1LFK1WDSFLYEGP 

KVIFRFALALFKYKEEEILKLQDSMSIFKYLRYFT 

RTDLDARSGTDAPTTWRKSGWS 

3166 

A 

10 

4070 

FPGPTISSNSQLYRASALFETIRHEAQLSTDYKLS 
LFDLQTSSYQALQRVLVSLGHHDEALAVAERGR 





TRAFADLLVERQTGQQDSDPYSPVTIDQILEMVN 

GQRGLVLYYSLAAGYLYSWLLAPGAGIVKFHEH 

YLGENTVENSSDFQASSSVTLPTATGSALEQHIAS 

VREALGVESHYSRACASSETESEAGDIMDQQFEE 

M>WKI.NSVTDPTGFLRMVRR>INLF^ 

LFSNTVSPTQDGTSSLPRRQSSFAKPPLRALYDLL 

IAPMEGGLMHSSGPVGRHRQLILVLEGELYL1PF 

ALLKGSSSNEYLYERFGLLAVPSIRSLSVQSKSHL 

RKNPPTYSSSTSMAAVIGNPKLPSAVMDRWLWG

PMPSAEEEAYMVSELLGCQPLVGSVATKERVMS 

ALTQAECVHFATHISWKLSALVLTPSMDGNPASS 

KSSFGHPYTIPESLRVQDDASDGESISDCPPLQEL 

LLTAADVLDLQLPVKLVVLGSSQESNSKVAADG 

VIALTRAFLAAGAQCVLVSLWPVPVAAFKMFIH 

AFYSSLLNGLKASAALGEAMKVVQSSKAFSHPS 

NWAGFMLIGSDVKLNSPSSLIGQALTEILQHPER 

ARDALRVLLHLVEKSLQRJQNGQRNAMYTSQQS 

VENKVGGIPGWQALLTAVGFRLDPPTSGLPAAV 

FFPTSDPGDRLQQCSSTLQSLLGLPNPALQALCK 

LITASETGEQLISRAVKNMVGMLHQVLVQLQAG 

EKEQDLASAP1QVSISVQLWRLPGCHEFLAALGF 

VLCEVGQEEVILKTGKQANRRTVHFALQSLLSLF 

DSTELPKRLSLDSSSSLESLASAQSVSNALPLGYQ 

QPPFSPTGADSIASDAISVYSLSSIASSMSFVSKPE 

GGSEGGGPGGRQDHDRSKNAYLQRSTLPRSQLP 

PQTRPAGNKDEEEYEGFSIISNEPLATYQENRNTC 

FSPDHKQPQPGTAGGMRVSVSSKGSISTPNSPVK 

MTLIPSPNSPFQKVGKLASSDTGESDQSSTETDST 

VKSQEESNPKLDPQELAQKILEETQSHLIAVERLQ 

RSGGQVSKSNNPEDGVQAPSSTAVFRASETSAFS 

RPVLSHQKSQPSPVTVKPKPPARSSSLPKVSSGYS 

SPTTSEMSIKDSPSQHSGRPSPGCDSQTSQLDQPL 

FKLKYPSSPYSAHISKSPRNMSPSSGHQSPAGSAP 

SPALSYSSAGSARSSPADAPDIDKLKMAAIDEKV 

QAVHNLKMFWQSTPQHSTGPMKIFRGAPGTMTS 

KRDVLSLLNLSPRPNKKEEGVDKLELKELSLQQH 

DGAPPKAPPNGHWRTETTSLGSLPLPAGPPATAP 

ARPLRLPSGNGYKFLSPGRFFPSSKC 

3167 

A 

1 

762 

AARRRQKGKEENMMMDLFETGSYFFYLDGENV 

TLQPLEVAEGSPLYPGSDGTLSPCQDQMPPEAGS 

DSSGEEHVLAPPGLQPPHCPGQCLIWACKTCKRK 

SAPTDRRKAATLRERRRLKKINEAFEALKRRTVA 

NPNQRLPKVEILRSAISYIERLQDLLHRLDQQEK 

MQELGVDPFSYRPKQENLEGADFLRTCSSQWPS 

VSDHSRGLVITAKEGGASIDSSASSSLRCLSSIVDS 

ISSEERKLPCVEEVVEK 

3168 

A 

701 

246 

TSRRVTMKIWFVTSDRSIG^RKRHFNAPSHVRR 

KIMSSPLSKELRQKYNVRSMPIRKDDEVQVVRG 

HYKGQQIGKVVQVYRKKYVIYIERVQREKANGT 

TVHVGIHPSKVVITRLKLDKDRKKILERKAKSRQ 

VGKEKGKYKEELIEKMQE 

3169 

A 

156 

3168

GPGGAISLSVEAKAGADLLVKGKQARMDIYDTQ 
TLGWVFGGFMVVSAIGIFLVSTFSMKETSYEEA 
LANQRKEMAKTHHQKVEKKKKEKTVEKKGKT 
KKKEEKPNGKIPDHDPAPNVTVLLREPVRAPAV 





AVAPTPVQPPIIVAPVATVPAMPQEKLASSPKDK 

KKKEKKVAKVEPAVSSVVNSIQVLTSKAAILETA 

PKEGRNTDVAQSPEAPKQEAPAKKKSGSKKKGP 

PDADGPLYLPYKTLVSTVGSMVFNEGEAQRLIEI 

LSEKAGI1QDTWHKATQKGDPVAELKRQLEEKEK 

LLATEQEDAAVAKSKLRELNKEMAAEKAKAAA 

GEAKVKKQLVAREQE1TAVQARMQASYREHVK 

EVQQLQGKIRTLQEQLENGPNTQLARLQQENSIL 

RDALNQATSQVESKQNAELAKLRQELSKVSKEL 

VEKSEAVRQDEQQRKALEAKAAAFEKQVLQLQ 

ASHRESEEALQKRLDEVSRELCHTQSSHASLRAD 

AEKAQEQQQQMAELHSKLQSSEAEVRSKCEELS 

GLHGQLQEARAENSQLTERIRSIEALLEAGQARD 

AQDVQASQAEADQQQTRLKELESQVSGLEKEAI 

ELREAVEQQKVKNNDLREKNWKAMEALATAEQ 

ACKEKLHSLTQAKEESEKQLCLIEAQTMEALLAL 

LPELSVLAQQNYTEWLQDLKEKGPTLLKHPPAP 

AEPSSDLASKLREAEETQSTLQAECDQYRSILAET 

EGMLRDLQKSVEEEEQVWRAKVGAAEEELQKS 

RVTVKHLEEIVEKLKGELESSDQVREHTSHLEAE 

LEKHMAAASAECQNYAKEVAGLRQLLLESQSQL 

HAAK^FAOKO^nFT Af VROOT ^FMK"<5HVFr>f;r>T 

AGAPASSPEAPPAEQDPVQLKTQLEWTEAILEDE 

QTQRQKLTAEFEEAQTSACRLQEELEKLRTAGPL 

ESSETEEASQLKERLEKEKKLTSDLGRAATRLQE 

LLKTTQEQLAREKDTVKKLQEQLEKAEDGSSSK 

EGTSV 

3170 

A 

6730 

4027 

THASEKYSYGHLPTHSITAHPMVTIRISDRQRLIQ 

PYIHNYSWLLFAALALYSAHLASAEDVDGEKLD 

PQTRSSATTLRSQCMQLVGDCLMKAHQGKGLK 

ALALLGVLPDGDSSLEDHALPVTVPTGASEEQLE 

KKAVQGAELSEAGNGKRAVHEEIRPVDFKQRNK 

ADKGVSLSKDPSCQTQISDSPADASPPTGLPDAE 

DSEVSSQKPIEEKAVTPSPEQVFAECSQKR1LGLL 

AAMLPPLKSGPTVPLIDLEITVLPLMFQVVISNAG 

HLNETYHLTLGLLGQLIIRLLPAEVDAAVIKVLSA 

KHNLFAAGDSS1VPDGWKTTHLLFSLGAVCLDS 

RVGLDWACSMAEILRSLNSAPLWRDVIATFTDH 

CIKQLPFQLKHTNIFTLLVLVGFPQVLCVGTRCV 

YMDNANEPHNV1ILKHFTEKNRAVIVDVKTRKR 

KTVKDYQLVQKGGGQECGDSRAQLSQYSQHFA 

FIASHLLQSSMDSHCPEAVEATWVLSLALKGLY 

KTLKAHGFEEIRATFLQTDLLKLLVKKCSKGTGF 

SKTWLLRDLEILSIMLYSSKKEINALAEHGDLEL 

DERGDREEEVERPVSSPGDPEQKKLDPLEGLDEP 

TRICFLMAHDALNAPLHELRAIYELQMKKTDYFF 

LEVQKRFDGDELTTDERIRSLAQRWQPSKSLRLE 

EQSAKAVDTDMIILPCLSRPARCDQATAESNPVT 

OKI 1SSTESEI OOSYAKORRSKSAAT T HKFT NOK 

SKRAVRDYLFRVNEATAVLYARHVLASLLAEWP 

SHVPVSEDILELSGPAHMTYILDMFMQLEEKHE I 

WEKWMQTELVLTHQVLPLPHRLPPVSASWSEA 

TCVAVQLPDRCECSKGRVTVSSPKDWASEELRG 

PERDFQLNQKALSPSSQFPSAEILRHIR 

3171 

A 

557 

89 

GTRAGPVKDREAFQRLNFLYQAAHCVLAQDPEN 





QALARFYCYTERTIAKRLVLRRDPSVKRTLCRGC 
SSLLVPGLTCTQRQRRCRGQRWTVQTCLTCQRS 
QRFLNDPGHLLWGDRPEAQLGSQADSKPLQPLP 
NTAHS1SDRLPEEKMQTQGSSNQ 

3172 

A 

2 

496 

FRRAGAGRGRRRGEVTSPLSPEPLAFQSLATSRR 

PEPQTTQTVRSSALPAPPASPMSQYAPSPDFKRA 

LDSSPEANTEDDKTEEDVPMPKNYLWLTIVSCFC 

PAYP1NIVALVFSIMSLNSYNDGDYEGARRLGRN 

AKWVA1AS1I1GLLIIGISCAVHFTRNA 

3173 

A 

2 

4048 

FRSGGCRRRAWTSRWPQRRRSPESCEAPLSAPL 

WGPQRGLPGREPLRSRSASA1ALRTIGHILALLLR 

LLHLGLGSGGCREDVPPSGRGKKEEKMKKHRRA 

LALVSCLFLCSLVWLPSWRVCCKESSSASASSYY 

SQDDNCALENEDVQFQKKDEREGPINAESLGKS 

GSNLPISPKEHKLKDDSIVDVQNTESKKLSPPVVE 

TLPTVDLHEESSNAVVDSETVENISSSSTSEITPIS 

KLDEIEKSGTIPIAKPSETEQSETDCDVGEALDAS 

AP1EQPSFVSPPDSLVGQHIENVSSSHGKGKITKSE 

FESKVSASEQGGGDPKSALNASDNLKNESSDYT 

KPGDIDPTSVASPKDPEDIPTFDEWKKKVMEVEK 

EKSQSMHASSNGGSHATKKVQKNRNNYASVEC 

GAKILAANPEAKSTSAILIENMDLYMLNPCSTKI 

WFVIELCEPIQVKQLDIANYELFSSTPKDFLVSISD 

RYPTNKWIKLGTFHGRDERNVQSFPLDEQMYAK 

YVKMFIKYIKVELLSHFGSEHFCPLSLLRVFGTSM 

VEEYEEIADSQYHSERQELFDEDYDYPLDYNTGE 

DKSSKNLLGSATNAILNMVNIAANILGAKTEDLT 

EGNKSISENATATAAPKMPESTPVSTPVPSPEYVT 

TEVHTHDMEPSTPDTPKESPIVQLVQEEEEEASPS 

TVTLLGSGEQEDESSPWFESETQIFCSELTTICCIS 

SFSEYIYKWCSVRVALYRQRSRTALSKGKDYLV 

LAQPPLLLPAESVDVSVLQPLSGELENTNDEREAE 

TVVLGDLSSSMHQDDLVNHTVDAVELEPSHSQT 

LSQSLLLDITPEHSIPLPKIEVSESVEYEAGHIPSPVI 

PQESSVEIDNETEQKSESFSSIEKPSITYETNKVNE 

LMDNIIKEDVNSMQIFTKJ.SETIVPPINTATVPDN 

EDGEAKMNIADTAKQTLISVVDSSSLPEVKEEEQ 

SPEDALLRGLQRTATDFYAELQNSTDLGYANGN 

LVHGSNQKESVFMRLNNRIKALEVNMSLSGRYL 

EELSQRYRKQMEEMQKAFNKTIVKLQNTSRIAE 

EQDQRQTEA1QLLQAQLTNMTQLVSNLSATVAE 

LKREVSDRQSYLVISLVLCVVLGLMLCMQRCRN 

TSQFDGDYISKLPKSNQYPSPKRCFSSYDDMNLK 

RRTSFPLMRSKSLQLTGKEVDPNDLYIVEPLKFSP 

EKJCKKRCKYKIEKIETIKPEEPLHPIANGDIKGRK 

PFTNQRDFSNMGEVYHSSYKGPPSEGSSETSSQS 

EESYFCGISACTSLCNGQSQKTKTEKRALKRRRS 

KVQDQGKLIKTLIQTKSGSLPSLHDIIKGNKEITV 

GTFGVTAVSGHI 

3174 

A 

485 

4668 

RKCSKEKASKTPSQKIPTTPCCVLQAGPEPRSLAE 
RMGADGETVVLKNMLIGVNLILLGSMIKPSECQL 
EVTTERVQRQSVEEEGGIANYNTSSKEQPVVFNH 
VYNINVPLDNLCSSGLEASAEQEVSAEDETLAEY 
MGQTSDHESQVTFTHRINFPKKACPCASSAQVLQ 
ELLSRIEMLEREVSVLRDQCNANCCQESAATGQL 

DYIPHCSGHGNFSFESCGCICNEGWFGKNCSEPY 

CPLGCSSRGVCVDGQCICDSEYSGDDCSELRCPT 

DCSSRGLCVDGECVCEEPYTGEDCRELRCPGDCS 

GKGRCANGTCLCEEGYVGEDCGQRQCLNACSG 

RGQCEEGLCVCEEGYQGPDCSAVAPPEDLRVAG 

ISDRSIELEWDGPMAVTEYVISYQPTALGGLQLQ 

QRVPGDWSGVTITELEPGLTYNISVYAVISNILSL 

PITAKVATHLSTPQGLQFKTITETTVEVQWEPFSF 

SFDGWEISFIPKNNEGGVIAQVPSDVTSFNQTGLK 

PGEEYIVNVVALKEQARSPPTSASVSTVIDGPTQI 

LVRDVSDTVAFVEWIPPRAKVDFILLKYGLVGGE 

GGRTTFRLQPPLSQYSVQALRPGSRYEVSVSAVR 

GTNESDSATTQFTTEIDAPKNLRVGSRTATSLDL 

EWDNSEAEVQEYKVVYITLAGEQYHEVLVPRGI 

GPTTRATLTDLVPGTEYGVGISAVMNSQQSVPAT 

MNARTELDSPRDLMVTASSETSISLIWTKASGPID 

HYRITFTPSSGIASEVTVPKDRTSYTLTDLEPGAE 

YIISVTAERGRQQSLESTVDAFTGFRPISHLHFSH 

VTSSSVNITWSDPSPPADRLILNYSPRDEEEEMME 

VSLDATKRHAVLMGLQPATEYIVNLVAVHGTVT 

SEPIVGSITTGIDPPKDITISNVTKDSVMVSWSPPV 

ASFDYYRVSYRPTQVGRLDSSVVPNTVTEFTITR 

LNPATEYEISLNSVRGREESERICTLVHTAMDNP 

VDLIATNITPTEALLQWKAPVGEVENYVIVLTHF 

AVAGETILVDGVSEEFRLVDLLPSTHYTATMYAT 

NGPLTSGTISTNFSTLLDPPANLTASEVTRQSALIS 

WQPPRAEIENYVLTYKSTDGSRKELIVDAEDTWI 

RLEGLLENTDYTVLLQAAQDTTWSSITSTAFTTG 

GRVFPHPQDCAQHLMNGDTLSGVYPIFLNGELS 

QKLQVYCDMTTDGGGWIVFQRRQNGQTDFFRK 

w A n yr voFfiM vphppwi ax nwrt-rorpsnnR VPr 

vv s\u i iv V vjr vJiN V CJLvCr W i^VJJul^lN Jfllxi 1 ov^OlA. I CL, 

RVDMRDGQEAAFASYDRFSVEDSRNLYKLRIGS 

YNGTAGDSLSYHQGRPFSTEDRDNDVAVTNCA 

MSYKGAWWYKNCHRTNLNGKYGESRHSQGIN 

WYHWKGHEFSIPFVEMKMRPYNHRLMAGRKRQ 

SLQF 

3175 

A 

2 

623 

RLQLPACPALSAAHPLALPSFSSQCHRAEARAAA 

A ATA FfiTM A <?fi VTVMnPVTK" VFNmX/TK' VR K^<sT 
J-\J\ l r\-CAJ l lvx/vovj V 1 V iHUCt V Irv V r IN i_/iVJJ\. V iViYo O I 

QEEIKXRKKAVLFCLSDDKRQIIVEEAKQILVGDI 
GDTVEDPYTSFVKLLPLNDCRYALYDATYETBCE 
SKKEDLVFIFWAPESAPLKSKMIYASSKDAIKKK 
FTGIKHEWQVNGLDDIKDRSTLGEKLGGNVVVS 
LEGKPL 

3176 

A 

99 

1567 

PRGCWSSCLDAMFRLNSLSALAELAVGSRWYH 

GGSQPIQIRRRLMMVAFLGASAVTASTGLLWKR 

AHAESPPCVDNLKSDIGDKGKNKDEGDVCNHEK 

KTADLAPHPEEKKKKRSGFRDRKVMEYENRIRA 

YSTPDKIFRYFATLKVISEPGEAEVFMTPEDFVRS 

ITPNEKQPEHLGLDQYIIKRFDGKTEKISQEREKF 

ADEGSIFYTLGECGLISFSDYIFLTTVLSTPQRNFE 

IAFKMFDLNGDGEVDMEEFEQVQSIIRSQTSMG 

MRHRDRPTTGNTLKSGLCSALTTYFFGADLKGK 

LTIKNFLEFQRKLQHDVLKLEFERHDPVDGRITE 

RQFGGMLLAYSGVQSKXLTAMQRQLKKHFKEG 

KGLTFQEVENFFTFLKNINDVDTALSFYHMAGAS 

LDKVTMQQVARTVAKVELSDHVCDWFALFDC 
DGNGELSNKEFVSJMKQRLMRGLEKPKDMGFTR 
LMQAMWKCAQETAWDFALPKQ 

3177 

A 

182 

648 

LGVVGSGAAVGGRQAARGAALGRRPMAAVLG 
ALGATRRLLAALRGQSLGLAAMSSGTHRLTAEE 
RNQAILDLKAAGWSELSERDAIYKEFSFHNFNQA 
FGFMSRVALQAEKMNHHPEWFNVYNKVQITLTS 
HDCGELTKKDVKLAKFIEKAAASV 

3178 

A 

8 

612 

ACGCRSFCGSTVMSLLLYYALPALGSYAMLSIFF 

LRRPHLLHTPRAPTFRIRLGAHRGGSGELLENTM 

EAMENSMAQRSDLLELDCQLTRDRVVVVSHDE 

NLCRQSGLNRDVGSLDFEDLPLYKEKLEVYFSPG 

HFAHGSDRRMVRLEDLFQRFPRTPMSVEIKGKN 

EELIREIAGLVRRYDRNEITIWASEKSSVMKKCK 

3179 

A 

88 

1496 

QETSKMETLSFPRYNVAEIVIHIRNKILTGADGKN 

LTKNDLYPNPKPEVLHMIYMRALQIVYGIRLEHF 

YMMPVNSEVMYPHLMEGFLPFSNLVTHLDSFLPI 

CRVNDFETADILCPKAKRTSRFLSGIINFIHFREAC 

RETYMEFLWQYKSSADKMQQLNAAHQEALMK 

LERLDSVPVEEQEEFKQLSDGIQELQQSLNQDFH 

QKTIVLQEGNSQKKSNISEKTKRLNELKLSVVSL 

KEIQESLKTKIVDSPEKLKNYKEKMKDTVQKLK 

NARQEWEKYEIYGDSVDCLPSCQLEVQLYQKK 

IQDLSDNREKLASILKESLNLEDQIESDESELKKL 

KTEENSFKJU^MIVKKEKLATAQFKrNKKHEDVK 

QYKRTVIEDCNKVQEKRGAVYERVTTINHEIQKI 

RLGIQQLKDAADREKLKSQEIFLNLKTALEKYHD 

GIEKAAEDSYAKIDEKTAELKRKMFKMST 

3180 

A 

298 

7086 

GNMACWPQLRLLLWKNLTFRRRQTCQLLLEVA 

WPLFIFLILISVRLSYPPYEQHECHFPNKAMPSAG 

TLPWVQGIICNANNPCFRYPTPGEAPGVVGNFNK 

SIVARJLFSDARRLLLYSQKDTSMKDMRKVLRTL 

QQIKKSSSNLKLQDFLVDNETFSGFLYHNLSLPK 

STVDKMLRADVILHKVFLQGYQLHLTSLCNGSK 

SEEMIQLGDQEVSELCGLPREKLAAAERVLRSN 

MDILKPILRTLNSTSPFPSKELAEATKTLLHSLGT 

LAQELFSMRSWSDMRQEVMFLTNVNSSSSSTQI 

YQAVSRIVCGHPEGGGLKIKSLNWYEDNNYKAL 

FGGNGTEEDAETFYDNSTTPYCNDLMKNLESSPL 

SRnWKALKPLLVGKILYTPDTPATRQVMAEVNK 

TFQELAVFHDLEGMWEELSPKIWTFMENSQEMD 

LVRMLLDSRDNDHFWEQQLDGLDWTAQDIVAF 

LAKHPEDVQSSNGSVYTWREAFNETNQAIRTISR 

FMECVNLNKLEPIATEVWLINKSMELLDERKFW 

AGIVFTGITPGSIELPHHVKYKIRMGIDNVERTNK 

IKDGYWDPGPRADPFEDMRYVWGGFAYLQDVV 

EQAIIRVLTGTEKKTGVYMQQMPYPCYVDDIFLR 

VMSRSMPLFMTLAWIYSVAVIIKGIVYEKEARLK 

ETMRIMGLDNSILWFSWnSSLIPLLVSAGLLVVI 

LKLGNLLPYSDPSWFVFLSVFAVVTILQCFLIST 

LFSRANLAAACGGIIYFTLYLPYVLCVAWQDYV 

GFTLKIFASLLSPVAFGFGCEYFALFEEQGIGVQW 

DNLFESPVEEDGFNLTTSVSMMLFDTFLYGVMT 

WYIEAVFPGQYGIPRPWYFPCTKSYWFGEESDEK 

SHPGSNQKRJSEICMEEEPTHLKLGVSIQNLVKVY 

RDGMKVAVDGLALNFYEGQITSFLGHNGAGKTT 

TMSILTGLFPPTSGTAYILGKDIRSEMSTIRQNLG 

VCPQHNVLFDMLTVEEHIWFYARLKGLSEKHVK 

AEMEQMALDVGLPSSKLKSKTSQLSGGMQRKLS 

VALAFVGGSKVVILDEPTAGVDPYSRRGIWELLL 

KYRQGRTIILSTHHMDEADVLGDRIAIISHGKLCC 

VGSSLFLKNQLGTGYYLTLVKKDVESSLSSCRNS 

SSTVSYLKKEDSVSQSSSDAGLGSDHESDTLTID 

VSAISNLIRKHVSEARLVEDIGHELTYVLPYEAA 

KEGAFVELFHEIDDRLSDLGISSYGISETTLEEIFL 

KVAEESGVDAETSDGTLPARRNRRAFGDKQSCL 

RPFTEDDAADPNDSDIDPESRETDLLSGMDGKGS 

YQVKGWKLTQQQFVALLWKRLLIARRSRKGFF 

AQIVLPAVFVCIALVFSLIVPPFGKYPSLELQPWM 

YNEQYTFVSNDAPEDTGTLELLNALTKDPGFGT 

RCMEGNPIPDTPCQAGEEEWTTAPVPQTIMDLFQ 

NGNWTMQNPSPACQCSSDKIKKMLPVCPPGAGG 

LPPPQRKQNTADILQDLTGRNISDYLVKTYVQIIA 

KSLKNKIWVNEFRYGGFSLGVSNTQALPPSQEV 

NDATKQMKKHLKLAKDSSADRFLNSLGRFMTG 

LDTRNNVKVWFNNKGWHAISSFLNVINNAILRA 

NLQKGENPSHYGITAFNHPLNLTKQQLSEVAPM 

TTSVDVLVSICVIFAMSFVPASFVVFLIQERVSKA 

KHLQFISGVKPVIYWLSNFVWDMCNYVVPATLV 

IIIFICFQQKSYVSSTNLPVLALLLLLYGWSITPLM 

YPASFVFKJPSTAYWLTSVNLFIGINGSVATFVL 

ELFTDNKLNNINDILKSVFLIFPHFCLGRGLIDMV 

KNQAMADALERFGENRFVSPLSWDLVGRNLFA 

MAVEGVVFFLITVLIQYRFFIRPRPVNAKLSPLND 

EDEDVRRERQRJLDGGGQNDILEIKELTKIYRRK 

RKPAVDRICVGIPPGECFGLLGVNGAGKSSTFKM 

LTGDTTVTRGDAFLNRNSILSNIHEVHQNMGYCP 

QFDAITELLTGREHVEFFALLRGVPEKEVGKVGE 

WAIRKLGLVKYGEKYAGNYSGGNKRKLSTAMA 

LIGGPPVVFLDEPTTGMDPKARRFLWNCALSVV 

ivbljrKo V V JL 1 oJtloMJc.iiL'rl.AL.L- 1 KJVIAJJYI V INOrvrlvU 

LGSVQHLKNRFGDGYTIVVRIAGSNPDLKPVQDF 

FGLAFPGSVPKEKHRNMLQYQLPSSLSSLARIFSI 

LSQSKKRLHIEDYSVSQTTLDQVFVNFAKDQSDD 

DHLKDLSLHKNQTVVDVAVLTSFLQDEKVKESY 

V 

3181 

A 

215 

1367 

PPATSQAALPEALSKGRETPRPATHPARSQDVRP 

LSCPFDFLRDNVEWSEEQAAAAERKVQENSIQR 

VCQEKQVDYEINAHKYWNDFYKIHENGFFKDR 

HWLFTEFPELAPSQNQNHLKDWFLENKSEVPEC 

RNNEDGPGLIMEEQHKCSSKSLEHKTQTPPVEEN 

VTQKISDLEICADEFPGSSATYRILEVGCGVGNTV 

r r LLl^ I IN IN \Jr\jL.r V I t-LUr oo 1 AlHL V 1 IN oil I Ur 

SRCFAFVHDLCDEEKSYPVPKGSLDIIILIFVLSAI 

AQLRFKKGQCLSGNFYVRGDGTRVYFFTQEELD 
TLFTTAGLEKVQNLVDRRLQVNRGKQLTMYRV 
WIQCKYCKPLLSSTS 

3182 

A 

3 

1289 

GSETQHLPRDPQHLPWDPQQHQDRRRPELFHAF 
ARDSAPPPSMVLAAETTSQQERLQAIAEKRKRQ 

AEfENKRRQLEDERRQLQHLKSKALRERWLLEG 

TPSSASEGDEDLRRQMQDDEQKTRLLEDSVSRLE 

KGIEVLERGDSAPAAAKENAAAPSPVRAPAPSPA 

KEERKTEVVMNSQQTPVGTPKDKRVSNTPLRTV 

DGSPMMKAAMYSVEITVEKDKVTGETRVLSSTT 

LLPRQPLPLGIKVYEDETKVVHAVDGTAENGIHP 

LSSSEVDELIHKADEVTLSEAGSTAGAAETRGAV 

EGAARTTPSRREITGVQAQPGEATSGPPGIQPGQE 

PPVTMIFMGYQNVEDEAETKXVLGLQDTITAEL 

VVIEDAAEPKEPAPPNGSAAEPPTEAASREENQA 

GPEATTSDPQDLDMKKHRCKCCSIM 

3183 

A 

333 

1931 

IAPTGGSHSEIQKQLGSGGDSSSQRRAERRTEPRS 

APRPRWGRSARSPGAHKLPGPPRRRDPGAWARL 

EAAAAHRHSRGSMGRRMRGAAATAGLWLLAL 

GSLLALWGGLLPPRTELPASRPPEDRLPRRPARS 

GGPAPAPRFPLPPPLAWDARGGSLKTFRALLTLA 

AGADGPPRQSRSEPRWHVSARQPRPEESAAVHG 

GVFWSRGLEEQVPPGFSEAQAAAWLEAARGAR 

MVALERGGCGRSSNRLARFADGTRACVRYGINP 

EQIQGEALSYYLARLLGLQRHVPPLALARVEAR 

GAQWAQVQEELRAAHWTEGSVVSLTRWLPNLT 

DVVVPAPWRSEDGRLRPLRDAGGELANLSQAEL 

VDLVQWTDLILFDYLTANFDRLVSNLFSLQWDP 

RVMQRATSNLHRGPGGALVFLDNEAGLVHGYR 

VAGMWDKYNEPLLQSVCVFRERTARRVLELHR 

GQDAAARJLLRLYRRHEPRFPELAALADPHAQLL 

QRRLDFLAKHILHCKAKYGRRSGDLVSPGGKER 

DLGLGYG 

3184 

A 

1 

1004 

GSTHASADAWAQWFCTEALVMGAPVWYLVAA 

ALLVGFILFLTRSRGRAASAGQEPLHNEELAGAG 

RVAQPGPLEPEEPRAGGRPRRRRDLGSRLQAQR 

RAQRVAWAEADENEEEAVILAQEEEGVEKPAET 

HLSGKIGAKKLRKLEEKQARKAQREAEEAEREE 

RKRLESQREAEWKKEEERLRLEEEQKEEEERKA 

REEQAQREHEEYLKLKEAFVVEEEGVGETMTEE 

QSQSFLTEFINYIKQSKVVLLEDLASQVGLRTQD 

TINRIQDLLAEGTITGVIDDRGKFIYITPEELAAVA 

NFIRQRGRVSIAELAQASNSLIAWGRESPAQAPA 

3185 

A 

2981 

7173 

CLLAGKFSSTLYETGGCDMSLVNFEPAARRASNI 

CDTDSHVSSSTSVRFYPHDVLSLPQIRLNRLLTED 

TDLLEQQDIDLSPDLAATYGPTEEAAQKVKHYY 

RFWILPQLWIGINFDRLTLLALFDRNREILENVLA 

VILAILVAFLGSILLIQGFFRDIWVFQFCLVIASCQ 

YSLLKSVQPDSSSPRHGHNRIIAYSRPVYFCICCG 

LrWLLDYGSRNLTATKFKLYGITFTNPLVFISARD 

LVIVFTLCFPIVFFIGLLPQVNTFVMYLCEQLDHI 

FGGNATTSLLAALYSFICSIVAVALLYGLCYGAL 

KDSWDGQHIPVLFSIFCGLLVAVSYHLSRQSSDP 

SVLFSLVQSKIFPKTEEKNPEDPLSEVKDPLPEKL 

RNSVSERLQSDLWCIVIGVLYFAIHVSTVFTVLQ 

PALKYVLYTLVGFVGFVTHYVLPQVRKQLPWH 

CFSHPLLKTLEYNQYEVRNAATMMWFEKLHVW 

LLr^EKNIIYPLIVLNELSSSAETIASPKKLNTELG 

ALMITVAGLKLLRSSFSSPTYQYVTVEFTVLFFKF 

DYEAFSETMLLDLFFMSILFNKLWELLYKLQFVY 

TYIAPWQITWGSAFHAFAQPFAVPHSAMLFIQAA 

VSAFFSTPLNPFLGSAIFITSYVRPVKFWERDYNT 

KRVDHSNTRLASQLDRNPGTYCQQREVEAITEG 

VEEDEGFCCCEPGHIPHMLSFNAAFSQRWLAWE 

VIVTKYILEGYSITDNSAASMLQVFDLRKVLTTY 

YVKGIIYYVTTSSKLEEWLANETMQEGLRLCAD 

RNYVDVDPTFNPNIDEDYDHRLAGISRESFCVIY 

LNWIEYCSSRRAKPVDVDKDSSLVTLCYGLCVL 

GRRALGTASHHMSSNLESFLYGLHALFKGDFRIS 

SIRDEWIFADMELLRKVVVPGIRMSIKLHQDHFT 

SPDEYDDPTVLYEAIVSHEKNLVIAHEGDPAWRS 

AVLANSPSLLALRHVMDDGTNEYKIIMLNRRYL 

SFRVIKVNKECVRGLWAGQQQELVFLRNRNPER 

GSIQNAKQALRNMINSSCDQPIGYPIFVSPLTTSY 

SDSHEQLKDILGGPISLGNIRNFIVSTWHRLRKGC 

GAGCNSGGNIEDSDTGGGTSCTGNNATTANNPH 

SNVTQGSIGNPGQGSGTGLHPPVTSYPPTLGTSHS 

SHSVQSGLVRQSPARASVASQSSYCYSSRHSSLR 

MSTTGFVPCRRSS rSQlSLKJsLrbbiQbKJ^bMVN^ 

MEPSGQSGLACVQHGLPSSSSSSQSIPACKHHTL 

VGFLATEGGQSSATDAQPGMTLSPANNSHSRKA 

EVIYRVQIVDPSQILEGINLSKRKELQWPDEGIRL 

KAGRNSWKDWSPQEGMEGHVIHRWVPCSRDPG 

TRSH1DKAVLLVQ1DDKY V 1 VlblGVLfclAjAfcv 

3186 

A 

3 

470 

SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 
DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 
IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 
PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 
KJn yoLr C Wlllr V vjl V orlJL 

3187 

A 

3 

470 

SLSAMRFLAATFLLLALSTAAQAEPVQFKDCGSV 

DGVIKEVNVSPCPTQPCQLSKGQSYSVNVTFTSN 

IQSKSSKAVVHGILMGVPVPFPIPEPDGCKSGINC 

PIQKDKTYSYLNKLPVKSEYPSIKLVVEWQLQDD 

KNQSLFCWEIPVQIVSHL 
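
The two coordinate columns give the nucleotide positions of the first and last residue of each peptide, so the span between them can be compared against codon boundaries. The following is a minimal sketch, assuming inclusive coordinates and three nucleotides per codon; `codon_span` is a hypothetical helper, and treating a begin value larger than the end value as a span read in the opposite direction is an assumption that the table itself does not state.

```python
def codon_span(begin, end):
    """Return (whole_codons, leftover_nucleotides) for an inclusive span.

    Hypothetical helper for illustration only. `begin` and `end` are the
    predicted nucleotide locations from the table; their order is ignored.
    """
    length = abs(end - begin) + 1  # inclusive nucleotide count
    return divmod(length, 3)       # three nucleotides per codon
```

For the row above (SEQ ID NO: 3187, begin 3, end 470), `codon_span(3, 470)` gives `(156, 0)`, that is, 468 nucleotides forming 156 whole codons. A nonzero remainder for some other row would only flag that row for a closer look; it does not by itself establish an error.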

3188 

A 

2 

3483 

PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEM 

EEMIEQLQEKVHELEKQNDTLKKRLISAKQQLQT 

QGYRQTPYNNVQSRINTGRRKANENAGLQECPR 

KGIKFQDADVAETPHPMFTKYGNSLLEEARGEIR 

NLENVIQSQRGQIEELEHLAEILKTQLRRKENEIE 

LSLLQLREQQATDQRSNIRDNVEMIKLHKQLVE 

KSNALSAMEGKFIQLQEKQRTLKISHDALMANG 

DELNMQLKEQRLKCCSLEKQLHSMKFSERRIEEL 

QDRINDLEKERELLKENYDKLYDSAFSAAHEEQ 

WKLKEQQLKVQIAQLETALKSDLTDKTEELDRL 

KTERDQNEKLVQENRELQLQYLEQKQQLDELKK 

RIKLYNQENDINADELSEALLLIKAQKEQKNGDL 

SFLVKVDSEINKDLERSMRELQATHAETVQELEK 

TRNMLMQHKINKI)YQMEVEAVTRKMENLQQD 

YELKVEQYVHLLDIRAARIHKLEAQLKDIAYGTK 

OVTf FKPFTMPnnwnFFnFTTHT FRGENLFEIrllN 

VjT rvri\.riZiHViruL/j v L/ijr l^j ljj in ii_j.L/1\-VJ * i_/a l^xx xxi ^ 

KVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTP 
WRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITL 
EVHQAYSTEYETIAACQLKFHEILEKSGRIFCTAS 
LIGTKGDIPNFGTVEYWFRLRVPMDQAIRLYRER 
AKALGYITSNFKGPEHMQSLSQQAPKTAQLSSTD 

STDGNLNELHITIRCCNHLQSRASHLQPHPYVVY 

KFFDFADHDTAHPSSNDPQFDDHMYFPVPMNM 

DLDRYLKSESLSFYVFDDSDTQENIYIGKVNVPLI 

SLArTORCISGIFELTDHQKHPAGTIHVlLKWKFA 

YLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST 

LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQ 

EGSVDEVKENTEKMQQGKDDVSLLSEGQLAEQS 

LASSEDETEITEDLEPEVEEDMSASDSDDCIIPGPI 

SKMKQPSEKIRIEIIALSLNDSQVTMDDTIQRLFV 

ECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIY 

VDKENNKAKRDILKAILQKQEMPNRSLRFTVVS 

DPPEDEQDLECEDIGVAHVDLADMFQEGRDLIE 

QNIDVFDARADGEGIGKLRVTVEALHALQSVYK 

QYRDDLEA 

3189 

A 

476 

1175 

MKGSGWHLRSGMVGTLITTILPHWRRTAHVGTN 

ILTAVSYLKGLWMECVWHSTGIYOCOIYRSLLA 

LPQDLQAARALMGISCLLSGIACACAVIGMKCTR 

CAKGTPAKTTFAILGGTLFILAGLLCMGAVSWTT 

NDVVQNFYNPLLPSGMKFEIGQALYLGFISSSLSL 

IGGTLLCLSCQDEAPYRPYQAPPRATTTTANTAP 

AYQPPAAYKDNRAPSVTSATHSGYRLNDYV 

3190 

A 

267 

1037 

DRMAWQGLVLAACLLMFPSTTADCLSRCSLCA 

VKTQDGPKPINPLICSLQCQAALLPSEEWERCQSF 

LSFFTPSTLGLNDKEDLGSKSVGEGPYSELAJCLS 

GSFLKELEKSKFLPSISTKENTLSKSLEEKLRGLS 

DGFREGAESELMRDAQLNDGAMETGTLYLAEE 

DPKEQVKRYGGFLRKYPKRSSEVAGEGDGDSM 

GHEDLYKRYGGFLRRIRPKLKWDNQKRYGGFLR 

RQFKVVTRSQEDPNAYSGELFDA 

3191 

A 

29 

574 

GTSAGAQTKGALCQLKVPTEKLPSPLPTMADEID 

FTTGDAGASSTYPMQCSALRKNGFVVLKGRPCK 

IVEMSTSKTGKHGHAKVHLVGIDIFTGKKYEDIC 

PSTHNA1DVPNIKRNDYQLICIQDGYLSLLTETGE 

VREDLKLPEGELGKEIEGKYNAGEDVQVSVMCA 

MSEEYAVAIKPCK 

3192 

A 

105 

1661 

KVSADGMQSCESSGDSADDPLSRGLRRRGQPRV 

VVIGAGLAGLAAAKALLEQGFTDVTVLEASSHIG 

GRVQSVKLGHATFELGATWIHGSHGNPIYHLTE 

ANGLLEETTDGERSVGRISLYSKNGVACYLTNH 

GRRIPKDVVEEFSDLYNEVYNLTQEFFRHDKPVN 

AESQNSVGWTREEVRNRIRNDPDDPEATICRLKL 

AMIQQYLKVESCESSSHSMDEVSLSAFGEWTEIP 

GAHHIIPSGFMRVVELLAEGIPAHVIQLGKPVRCI 

HWDQASARPRGPEIEPRGEGDHNHDTGEGGQGG 

EEPRGGRWDEDEQWSWVECEDCELIPADHVIV 

TVSLGVLKRQYTSFFRPGLPTEKVAAIHRLGIGTT 

DKJDFLEFEEPFWGPECNSLQFVWEDEAESHTLTY 

PPELWYRKICGFDVLYPPERYGHVLSGWICGEEA 

LVMEKCDDEAVAEICTEMLRQFTGNPNIPKPRRI 

LRSAWGSNPYFRGSYSYTQVGSSGADVEKLAKP 

LPYTESSKTATK 

3193 

A 

1 

1928 

QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQT 
ANLSVVFKDSNSTTPLIFVLSPGTDPAADLYKFA 
EEMKFSKKLSAISLGQGQGPRAEAMMRSSIERGK 
WVrTQNCHLAPSWMPALERLIEHINPDKVHRDF 

RLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRAN 

LLKSYSSLGEDFLNSCHKVMEFKSLLLSLCLFHG 

NALERRKFGPLGFNIPYEFTDGDLRICISQLKMFL 

DEYDDIPYKVLKYTAGEINYGGRVTDDWDRRCI 

MNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLH 

GYLSYIKSLPLNDMPEIFGLHDNANITFAQNETFA 

LLGTIIQLQPKSSSAGSQGREEIVEDVTQNILLKVP 

EPINLQWVMAKYPVLYEESMNTVLVQEVIRYNR 

LLQVITQTLQDLLKALKGLVVMSSQLELMAASL 

YM^TVPELWSAKAYPSLKPLSSWVMDLLQRLDF 

LQAWIQDGIPAVFWISGFFFPQAFLTGTLQNFAR 

KFVISIDTISFDFKVMFEAPSELTQRPQVGCYIHG 

LFLEGARWDPEAFQLAESQPKELYTEMAVIWLL 

PTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHST 

NYVIAVEIPTHQPQRHWIKRGVALICALDY 

3194 

A 

1 

1023 

DGWTPVHAAVDTGNVDSLKLLMYHRIPAHGNS 

FNEEESESSVFDLDGGEESPEGISKPVVPADLINH 

ANREGWTAAHIAASKGFKNCLEILCRHGGLEPE 

RRDKCNRTVHDVATDDCKHLLENLNALKIPLRIS 

VGEIEPSNYGSDDLECENTICALNIRKQTSWDDFS 

KA VSO A LTNHFO A I SS DG WWSLED VTCNNTTDS 

NIGLSARSIRSITLGNVPWSVGQSFAQSPWDFMR 

KNKAEHITVLLSGPQEGCLSSVTYASMIPLQMM 

QNYLRLVEQYHNVIFHGPEGSLQDYIVHQLALCL 

KHRQMGWQDSPVEIVEELEVGCWFFPREQLLRT 

CSLVA 

3195 

A 

1 

1809 

MAASAQVSVTFEDVAVTFTQEEWGQLDAAQRT 

LYQEVMLETCGLLMSLGCPLFKPELIYQLDHRQE 

LWMATKDLSQSSYPGDNTKPKTTEPTFSHLALPE 

EVLLQEQLTQGASKNSQLGQSKDQDGPSEMQEV 

HLKIGIGPQRGKLLEKMSSERDGLGSDDGVCTKI 

TQKQVSTEGDLYECDSHGPVTDALIREEKNSYK 

CEECGKVFKKNALLVQHERIHTQVKPYECTECG 

KTFSKSTHLLQHLIIHTGEKPYKCMECGKAFNRR 

SHLTRHQRJHSGEKPYKCSECGKAFTHRSTFVLH 

HRSHTGEKPFVCKECGKAFRDRPGFIRHYIIHTGE 

KPYECIECIECGKAFNRRSYLTWHQQIHTGVKPF 

ECNECGKAFCESADLIQHYIIHTGEKPYKCMECG 

KAFT4RJRSHLKQHQRIHTGEKPYECSECGKAFTH 

CSTFVLHKRTHTGEKPYECKECGKAFSDRADLIR 

HFSIHTGEKPYECVECGKAFNRSSHLTRHQQIHT 

GEKPYECIQCGKAFCRSANLIRHSIIHTGEKPYEC 

SECGKAFNRGSSLTHHQRIHTGRNPTIVTDVGRP 

FMTAQTSVNIQELLLGKEFLNITTEENLW 

3196 

A 

1400 

264 

VGFWERPLRSSRWFRRSLRRWEMLARAARGTG 

ALLLRGSLLASGRAPRRASSGLPRNTVVLFVPQQ 

EAWVVERMGRFHRILEPGLNILIPVLDRIRYVQSL 

KEIVrNVPEQSAVTLDNVTLQIDGVLYLRIMDPY 

KASYGVEDPEYAVTQLAQTTMRSELGKLSLDKV 

FRERESLNASIVDAINQAADCWGIRCLRYEIKDIH 

VPPRVKESMQMQVEAERRKRATVLESEGTRESA 

INVAEGKKQAQILASEAEKAEQINQAAGEASAVL 

AKAKAKAEAIRILAAALTQHNGDAAASLTVAEQ 

YVSAFSKLAKDSNTILLPSNPGDVTSMVAQAMG 

VYGALTKAPVPGTPDSLSSGSSRDVQGTDASLDE 

ELDRVKMS 

3197 

A 

66 

3632 

LWECAAAAAGQRDGGVTLFLKGRVLGRRCAAS 

LFAREVCVSTSSSRPACFLHCARARGEQMHQMA 

SGVGSMKRSPRKMWRPGEKKEPQGVVYEDVRD 

DTEDFKEPLKVVFEGSAYGLQNFNKQKKLKTCD 

DMDTFFLHYAAAEGQIELMEKITRDSSLEVLHE 

MDDYGNTPLHCAVEKNQIESVKFLLSRGANPNL 

RNFNMMAPLHIAVQGMNNEVMKVLLEHRTIDV 

NLEGENGNTAVIIACTTONSEALQILLNKGAKPC 

KSNKWGCFPIHQAAFSGSKECMEIILRFGEEHGY 

SRQLHINFMNNGKATPLHLAVQNGDLEMIKMCL 

DNGAQIDPVEKGRCTAIHFAATQGATEIVKLMIS 

SYSGSVDIVNTTD(3CHETMLHRASLFDHHELAD 

YLISVGADINKIDSEGRSPLILATASASWNIVNLL 

LSKGAQVDIKDNFGRNFLHLTVQQPYGLKNLRP 

EFMQMQQIKELVMDEDNDGCTPLHYACRQGGP 

GSVNNLLGFNVSIHSKSKDKKSPLHFAASYGRJN 

TCQRLLQDISDTRLLNEGDLHGMTPLHLAAKNG 

HDKVVQLLLKKGALFLSDHNGWTALHHASMGG 

YTQTMKVILDTNLKCTDRLDEDGNTALHFAARE 

GHAKAVALLLSHNADIVLNKQQASFLHLALHNK 

RKEVVLTIIRSKRWDECLKIFSHNSPGNKCPITEM 

IEYLPECMKVLLDFCMLHSTEDKSCRDYYIEYNF 

KYLQCPLEFTKKTPTQDVIYEPLTALNAMVQNN 

RIELLNHPVCKEYLLMKWLAYGFRAHMMNLGS 

YCLGLIPMTILVVNIKPGMAFNSTGIINETSDHSEI 

LDTTNSYLIKTCMILVFLSSIFGYCKEAGQIFQQK 

RNYFMDISNVLEWIIYTTGIIFVLPLFVEIPAHLQ 

WQCGAIAVYFYWMNFLLYLQRFENCGIFIVMLE 

VILKTLLRSTVVFIFLLLAFGLSFYILLNLQDPFSS 

PLLSIIQTFSMMLGDrNYRESFLEPYLRNELAHPV 

I SFAOI VSFTTFVP1V1 MNT T IGl AVGDIAFVOKH 

ASLKRIAMQVELHTSLEKKLPLWFLRKVDQKSTI 

VYPNKPRSGGMLFHIFCFLFCTGEIRQEIPNADKS 

LEMEILKQKYRLKDLTFLLEKQHELIKLHQKMEII 

SETEDDDSHCSFQDRFKKEQMEQRNSRWNTVLR 

AVKAKTHHLEP 

3198 

A 

51 

2177 

KEKSLHHVDQRPPLWHPGRPGTSQSAAMNASSE 

GESFAGSVQIPGGTTVLVELTPDIHICGICKQQFN 

NLDAFVAHKQSGCQLTGTSAAAPSTVQFVSEET 

VPATQTQTTTRTITSETQTITVSAPEFVFEHGYQT 

YLPTESNENQTATVISLPAKSRTKKPTTPPAQKRL 

NCCYPGCQFKTAYGMKDMERHLKIHTGDKPHK 

CEVCGKCFSRKDKLKTHMRCHTGVKPYKCKTC 

DYAAADSSSLNKHLRIHSDERPFKCQICPYASRN 

SSQLTVHLRSHTGDAPFQCWLCSAKFKISSDLKR 

HMRVHSGEKPFKCEFCNVRCTMKGNLKSHIRIK 

HSGNNFKCPHCAFLGDSKATLRKHSRVHQSEFIR 

EKCSECSYSCSSKj\ALRIHERIHCTVRPFKCNYCS 

FDSKQPSbn.SKHNOCKFHGDMVKTEALERKDTG 

RQSSRQVAKLDAKKSFHCDICDASFMREDSLRS 

HKRQHSEYNESKNSDVTVLQFQIDPSKQPATPLT 

VGHLQVPLQPSQVPQFSEGRVKIIVGHQVPQANT 

IVQAAAAAVNIVPPALVAQNPEELPGNSRLQILR 

QVSLIAPPQSSRCPSEAGAMTQPAVLLTTHEQTD 

GATLHQTLIPTASGGPQEGSGNQTFITSSGITCTD 
FEGLNALIQEGTAEVTVVSDGGQNIAVATTAPPV 
FSSSSQQELPKQTYSIIQGAAHPALLCPADSIPD 

3199 

A 

13 

2247 

QSFHSMEGDPSGLPLLARGASCYSLICPCPRPAD 

WSILQGTDWSILQSADWCIYNPLARHRALTGVFL 

QSADWCTYNPLARQKSSPSPHSTQEVQLASPLTR 

RPNKKDSAERNHRPAREGSVAQRQPNPAALEKA 

EPAARKRNEREGGGSQEPGREHSLEKGYWAPGL 

GPDPSMCSKQVDPSEGASSHLKHRGGSRAAHLE 

VRRLLRRLVGALVAEAGFCYVQVAEGQRVVGV 

LEVAEAAAAPVQHEPTAAVATQSRWFPRGTRPG 

LCSLPLAVAALLCPGSGPGAQSGLEFVERPPPSPL 

AVVLARWPLPPPAGRCPRDAPEARVPEKARAEG 

SERENNYGCGVVGGEMTTLVLDNGAYNAKIGY 

SHENVSVIPNCQFRSKTARLKTFTANQIDEJKDPS 

GLFYILPFQKGYLVNWDVQRQVWDYLFGKEMY 

QVDFLDTNIIITEPYFNFTSIQESMNEILFEEYQFQ 

AVLRVNAGALSAHRYFRDNPSELCCIIVDSGYSF 

THIVPYCRSKKKKEAIIRINVGGKLLTNHLKEIISY 

RQLHVMDETHVINQVKEDVCYVSQDFYRDMDI 

AKIXGEENTVMIDYVLPDFSTIKKGFCKPREEMV 

LSGKYKSGEQILRLANERFAVPEILFNPSDIGIQE 

MGIPEATVYSIQNLPEEMQPHFFKNIVLTGGNSLF 

PGFRJDRVYSEVRCLTPTDY DVoV VLPbNPl 1 YA W 

EGGKLISENDDFEDMVVTREDYEENGHSVCEEK 

FDI 

3200 

A 

3 

307 

AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIWKTR 
SCAGISGKSQLLFALVFTTRYLDLFTSFISLYMTS 
MKVWYA1HRN VrHLQC I uLW 1 LNLCC^LUlrN 

3201 

A 

1 

469 

IRHEGRGQRGKMELVQVLKRGLQQITGHGGLRG 

YLRVFFRTNDAKVGTLVGEDKYGNKYYEDNKQ 

FFGRHRWVVYTTEMNGKNTFWDVDGSMVPPE 

WHRWLHSMTDDPPTTKPLTARKFIWTNHKFNVT 

GTPEQYVPYSTTRKKIQEWIPPSTPYK 

3202 

A 

144 

840 

NSSQRIMATHALEIAGLFLGGVGMVGTVAVTVM 
PQWRVSAFIENNlVVrhNr WhuLWMNCVKC^ANl 
RMQCKIYDSLLALSPDLQAARGLMCAASVMSFL 
AFMMAILGMKCTRCTGDNEKVKAHILLTAGIIFII 
TGMVVLIPVSWVANAIIRDFYNSIVNVAQKRELG 
EALYLGWTTALVLIVGGALFCCVFCCNEKSSSYR 

^/^TT>QTJT>T^rVI/^VTJT7^k r fc r QDQ\/VQP CAVA/ 

YoIrorLKl 1 ylvoY ri I OisJvo.ro V Y oivoyi V 

3203 

A 

2 

473 

KYRYRRPYPVMRKICQVGPAGLAFILNISPVAHR 
VALCHLAGCQEQAAWYHTLQILFFLVSAYFFSCP 
VPEKYFPGSCDIVGHGHQIFHAFLSICTLSQLEAIL 
LDYQGRQEIFLQRHGPLSVHMACLSFFFLAACSA 
ATAALLRHKVKARLTKKDS 

3204 

A 

1808 

668 

PESAPLPAFISSRILPAAWRNWCSYVVTRTISCHV 
QNGTYLQRVLQNCPWPMSCPGSSYRTWRPTYK 
VMYKJVTAREWRCCPGHSRVSCEEVAGSSASLE 

r JVl WoLjo 1 1 Virvlvl vi/\ L ivr 1 /\.rovji^J-/iNv^ofv v ojcjl. I civ 

LKVLEAKMTMLTVIEQPVPPTPATPEDPAPLWGP 

PPAQGSPGDGGLQDQVGAWGLPGPTGPKGDAG 

SRGPMGMRGPPGDPLLSNTFTETNNHWPQGPTG 

PPGPPGPMGPPGPPGPTGVPGSPGHIGPPGPTGPK 

GISGHPGEKGERGLRGEPGPQGSAGQRGEPGPKG 

DPGEKSHWGEGLHQLREALKJLAERVLILETM1G 
LYEPELGSGAGPAGTGTPSLLRGKRGGHATNYRI 
VAPRSRDERG 

3205 

A 

2810 

1652 

RTSTQKWQSVFNDSQEHLERFYCNPENDRMRM 

KYGGQEFWADLNAMNVYETTEFDQLRRLSTPPS 

SNVNSIYHTVWKFFCRDHFGWREYPESVIRLIEE 

ANSRGLKEVRFMMWNNHYILHNSFFRREIKRRP 

LFRSCFILLPYLQTLGGVPTQAPPPLEATSSSQIICP 

DGVTSANFYPETWVYMHPSQDFIQVPVSAEDKS 

YRIIYNLFHKTVPEFKYRILQILRVQNQFLWEKY 

KRKXEYMNRKMFGRDRIINERHLFHGTSQDVVD 

GICKHNFDPRVCGKJiATMFGQGSYFAKKASYSH 

NFSKKSSKGVHFMFLAKVLTGRYTMGSHGMRR 

PPPVNPGSVTSDLYDSCVDNFFEPQIFVIFNDDQS 

YPYFVIQYEEVSNTVSI 

3206 

A 

297 

4500 

CLVDSKLWKGARSVYHQLFMSSLLMDLKYKKL 

FAVRFAKNYERLQSDYVTDDHDREFSVADLSVQ 

IFTVPSLARMLITEENLMSIIIKTFMDHLRHRDAQ 

GRFQFERYTALQAFKFRRVQSLILDLKYVLISKPT 

EWSDELRQKFLEGFDAFLELLKCMQGMDPITRQ 

VGQHIEMEPEWEAAFTLQMKLTHVISMMQDWC 

ASDEKVLIEAYKKCLAVLMQCHGGYTDGEQPIT 

LSICGHSVETIRYCVSQEKVSIHLPVSRLLAGLHV 

LLSKSEVAYKFPELLPLSELSPPMLIEHPLRCLVL 

CAQVHAGMWRRNGFSLVNQIYYYHNVKCRRE 

MFDKDVVMLQTGVSMMDPNHFLMIMLSRFELY 

QIFSTPDYGKRFSSEITHKDVVQQNNTLIEEMLYL 

IIMLVGERFSPGVGQVNATDEIKREIIHQLSIKPM 

AHSELVKSLPEDENKETGMESVIEAVAHFKKPGL 

TGRGMYELKPECAKEFNLYFYHFSRAEQSKAEE 

AQRKLKRQNREDTALPPPVLPPFCPLFASLVNILQ 

SDVMLCIMGTILQWAVEHNGYAWSESMLQRVL 

HLIGMALQEEKQHLENVTEEFIVVTFTFTQKISKP 

GEAPKNSPSILAMLETLQNAPYLEVHKDMIRWIL 

KTFNAVKKMRESSPTSPVAETEGTIMEESSRDKD 

KAERKRKAEIARLRREKIMAQMSEMQRHFIDEN 

KELFQQTLELDASTSAVLDHSPVASDMTLTALGP 

AQTQVPEQRQFVTCILCQEEQEVKVESRAMVLA 

AFVQRSTVLSKNRSKFIQDPEKYDPLFMHPDLSC 

GTHTSSCGHIMHAHCWQRYFDSVQAKEQRRQQ 

RLRLHTSYDVENGEFLCPLCECLSNTVIPLLLPPR 

NIFRNRLNFSDQPNLTQWIRTISQQrKALQFLRKE 

ESTPNNASTKNSENVDELQLPEGFRPDFRPKIPYS 

ESIKEMLTTFGTATYKVGLKVHPNEEDPRVPIMC 

WGSCAYTIQSIERILSDEDKPLFGPLPCRLDDCLR 

SLTRFAAAHWTVASVSVVQGHFCKPFASLVPND 

SHEELPCDLDIDMFHLLVGLVLAFPALQCQDFSGI 

SLGTGDLHIFHLVTMAHIIQILLTSCTEENGMDQE 

NPPCEEESAVLALYKTLHQYTGSALKEIPSGWHL 

WRSVRAGIMPFLKCSALFFHYLNGVPSPPDIQVP 

GTSHFEHLCSYLSLPNNLICLFQENSEIMNSLIES 

WCRNSEVKRYLEGERDAIRYPRESNKLINLPEDY 

SSLINQASNFSCPKSGGDKSRAPTLCLVCGSLLCS 

QSYCCQTELEGEDVGACTAHTYSCGSGVGIFLR 

VRECQVLFLAGKTKGCFYSPPYLDDYGETDQGL 

RRGNPLHLCKERFKKIQKLWHQHSVTEEIGHAQ 
EANQTLVGIDWQHL 

3207 

A 

49 

963 

QLSPSQAPAGAQEVARRVTVGSASHGGRRSTMA 

TTVSTQRGPVYIGELPQDFLRITPTQQQRQVQLD 

AQAAQQLQYGGAVGTVGRLMTWQAKLAKNY 

GMTRMDPYCRLRLGYAVYETPTAHNGAKNPRW 

NKVIHCTVPPGVDSFYLEIFDERAFSMDDRIAWT 

HITIPESLRQGKVEDKWYSLSGRQGDDKEGMINL 

VMSYALLPAAMVMPPQPVVLMPTVYQQGVGY 

VPITGMPAVCSPGMVPVALPPAAVNAQPRCSEE 

DLKAIQDMFPNMDQEVIRSVLEAQRGNKDAAIN 

SLLQMGEEP 

3208 

A 

54 

1196 

LERTPASADMAWTKYQLFLAGLMLVTGSINTLS 

AKWADNFMAEGCGGSKEHSFQHPFLQAVGMFL 

GEFSCLAAFYLLRCRAAGQSDSSVDPQQPFNPLL 

FLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 

VIIFTGLFSVAFLGRRLVLSQWLGILATIAGLVVV 

GLADLLSKHDSQHKLSEVITGDLLIIMAQIIVAIQ 

MVLEEKFVYKHNVHPLRAVGTEGLFGFVILSLLL 

VPMYYrPAGSFSGNPRGTLEDALDAFCQVGQQP 

LIAVALLGNISSIAFFNFAGISVTKELSATTRMVL 

DSLRTVVIWALSLALGWEAFHALQILGFLILLIGT 

ALYNGLHRPLLGRLSRGRPLAEESEQERLLGGTR 

TPINDAS 

3209 

A 

104 

1999 

AKVVSLKEFSCFWRREKPVSSLSSLQVKAEASW 

DSAVHGCPQLSRGTPVDERLFLIVRVTVQLSHPA 

DMQLVLRKRICVNVHGRQGFAQSLLKKMSHRSS 

IPGCGVTFEIVSNIPEDAQGVEEREALARMAANV 

ENPASADSEAYIEKYLRSVLAVENLLTLDRLRQE 

VAVKEQLTGKGKLSRRSISSPNVNRLSGSRQDLIP 

SYSLGSNKGRWESQQDVSQTTVSRGIAPAPALSV 

SPQNNHSPDPGLSNLAASYLNPVKSFVPQMPKLL 

KSLFPVRDEKRGKRPSPLAHQPVPRIMVQSASPDI 

RVTRMEEAQPEMGPDVLVQ™GAPALKICDKP 

AKVPSPPPVIAVTAVTPAPEAQDGPPSPLSEASSG 

YFSHSVSTATLSDALGPGLDAAAPPGSMPTAPEA 

EPEAPISHPPPPTAVPAEEPPGPQQLVSPGRERPDL 

EAPAPGSPFRVRRVRASELRSFSRMLAGDPGCSP 

GAEGNAPAPGAGGQALASDSEEADEVPEWLREG 

EFVTVGAHKTGVVRYVGPADFQEGTWVGVELD 

LPSGKNDGSIGGKQYFRCNPGYGLLVRPSRVRR 

ATGPVRRRSTGLRLGAPEARRSATLSGSATNLAS 

LTAALAKADRSHKNPENRKSWAS 

3210 

A 

324 

694 

SPFWTEKRRMEKPLFPLVPLHWFGFGYTALVVS 
GGIVGYVKTGSVPSLAAGLLFGSLAGLGAYQLY 
QDPRNVWGFLAATSVTFVGVMGMRSYYYGKF 
MPVGLIAGASLLMAAKVGVRMLMTSD 

3211 

A 

1078 

594 

VGMELPAVNLKVILLGHWLLTTWGCIVFSGSYA 
WANFTILALGVWAVAQRDSIDAISMFLGGLLATI 
FLDIVHISIFYPRVSLTDTGRFGVGMAILSLLLKPL 
SCCFVYHMYRERGGELLVHTGFLGSSQDRSAYQ 
TIDSAEAPADPFAVPEGRSQDARGY 

3212 

A 

1 

1962 

FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 
ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 
CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND 

PAAQSEPPRSPSRTYTYISR 
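
Each entry in this table pairs a beginning and an end nucleotide location with a peptide sequence written in the one-letter symbols of the column legend. As a minimal sketch, the Python below checks a sequence line against the legend and computes the nucleotide span of an entry. The function names are hypothetical, and reading an end location smaller than the beginning location as a reverse-strand coordinate pair is an assumption, not something the table states.

```python
# Symbols defined in the table legend: the 20 amino acids, X (unknown),
# * (stop codon), / (possible nucleotide deletion), \ (possible insertion).
LEGEND_SYMBOLS = set("ACDEFGHIKLMNPQRSTVWYX*/\\")


def undefined_symbols(sequence):
    """Return characters not defined by the legend, in order of first appearance.

    Whitespace is ignored, so a sequence may be passed with its line breaks.
    """
    found = []
    for ch in sequence:
        if ch.isspace():
            continue
        if ch not in LEGEND_SYMBOLS and ch not in found:
            found.append(ch)
    return found


def nucleotide_span(begin, end):
    """Return (length in nucleotides, strand) for an inclusive coordinate pair.

    Assumption: end < begin is treated as a reverse-strand entry ("-").
    """
    length = abs(end - begin) + 1
    strand = "+" if end >= begin else "-"
    return length, strand


def codon_count(begin, end):
    """Return (whole codons, leftover nucleotides) covered by the span."""
    length, _ = nucleotide_span(begin, end)
    return divmod(length, 3)
```

For an entry with beginning location 1 and end location 1962, `codon_count(1, 1962)` gives 654 whole codons with no remainder. A character such as a digit in a sequence line is reported by `undefined_symbols`, because the legend does not define it.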

3213 

A 

1 

1962 

FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLIERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES 

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV 

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS 

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTELIWDFLND 

PAAQSEPPRSPSRTYTYISR 

3214 

A 

1 

1962 

FRCGLAPKGRPRRRADPVASAIMDPAEAVLQEK 

ALKFMMEFRSWCPGWNTMARSRLTATSTSRVQ 

CSMPRSLWLGCSSLADSMPSLRCLYNPGTGALT 

AFQNSSEREDCNNGEPPRKIIPEKNSLRQTYNSCA 

RLCLNQETVCLASTAMKTENCVAKTKLANGTSS 

MIVPKQRKLSASYEKEKELCVKYFEQWSESDQV 

EFVEHLISQMCHYQHGHINSYLKPMLQRDFITAL 

PARGLDHIAENILSYLDAKSLCAAELVCKEWYR 

VTSDGMLWKKLEERMVRTDSLWRGLAERRGWG 

QYLFKNKPPDGNAPPNSFYRALYPKIIQDIETIES

NWRCGRHSLQRIHCRSETSKGVYCLQYDDQKIV

SGLRDNTIKIWDKNTLECKRILTGHTGSVLCLQY 

DERVIITGSSDSTVRVWDVNTGEMLNTLIHHCEA 

VLHLRFNNGMMVTCSKDRSIAVWDMASPTDITL 

RRVLVGHRAAVNVVDFDDKYIVSASGDRTIKV 

WNTSTCEFVRTLNGHKRGIACLQYRDRLVVSGS

SDNTIRLWDIECGACLRVLEGHEELVRCIRFDNK 

RIVSGAYDGKIKVWDLVAALDPRAPAGTLCLRT 

LVEHSGRVFRLQFDEFQIVSSSHDDTILIWDFLND
PAAQSEPPRSPSRTYTYISR

3215 

A 

2 

1376 

EARLVGCQRGGPARPGSYSSGAETAGRAMAAN 

LSRNGPALQEAYVRVVTEKSPTDWALFTYEGNS 

NDIRVAGTGEGGLEEMVEELNSGKVMYAFCRV 

KDPNSGLPKFVLINWTGEGVNDVRKGACASHVS 

TMASFLKGAHVTINARAEEDVEPECIMEKVAKA 

SGANYSFHKESGRFQDVGPQAPVGSVYQKTNAV 

SEIKRVGKDSFWAKAEKEEENRRLEEKRRAEEA 

QRQLEQERRERELREAARREQRYQEQGGEASPQ 

RTWPOOOFVV^sR MPTsJPOP^ A VHPP PIFkTOkf PR A 
fx 1 W E,\£\£\£tl, V V oIVlNiviN l3V < /-Do/\ V nl tvJuli iv^iNJtllvM. 

MSTTSISSPQPGKLRSPFLQKQLTQPETHFGREPA 
AAISRPRADLPAEEPAPSTPPCLVQAEEEAVYEEP 
PEQETFYEQPPLVQQQGAGSEHIDHHIQGQGLSG 
QGLCARALYDYQAADDTEISFDPENLITGIEVIDE
GWWRGYGPDGHFGMFPANYYELIE


A 



AA/t A QTT CVQPQPT DDT \/nD A A rtPCD A ADA T\l CW 

DPMAFFTGLWGPFTCVSRVLSHHCFSTTGSLSAI 

QKMTRVRVVDNSALGNSPYHRAPRCIHVYKKN

GVGKVGDQILLAIKGQICXKALIVGHCMPGPRMT 

PRFDSNNVVLIEDNGNPVGTRIKTPIPTSLRKREG 

EYSKVLAIAQNFV 

3217 

A 

1 

1563 

MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNWQKLDHWLMSNSSELMITHALERVCSVMP

A QTTfc r 'Pr , lTT VTYTV^PQT VOI VAfcTTTPPk'VPkrPTPr 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 
CKRLLTVSSHNLESKSTKRDILVAFKGGCSILPLP 
YMQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 
GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 
NAVQHCQKHVWKEMHLHAGEHA 

3218 

A 

1 

1563 

MLCALLLLPSLLGATRASPTSGPQECAKGSTVW 

CQDLQTAARCGAVGYCQGAVWNKPTAKSLPCD 

VCQDIAAAAGNGLNPDATESDILALVMKTCEWL 

PSQESSAGCKWMVDAHSSAILSMLRGAPDSAPA 

QVCTALSLCEPLQRHLATLRPLSKEDTFEAVAPF 

MANGPLTFHPRQAPEGALCQDCVRQVSRLQEAV 

RSNLTLADLNIQEQCESLGPGLAVLCKNYLFQFF 

VPADQALRLLPPQELCRKGGFCEELGAPARLTQ 

VVAMDGVPSLELGLPRKQSEMQMKAGVTCEVC 

MNVVQKLDHWLMSNSSELMITHALERVCSVMP 

1 JvUv^llL/ vU I I jroL V V<JJL» V /A.iS.1 I r E-jv V ^J\JT IJtvJL, 

CGNRRRARAVHDAYAIVPSPEWDAENQGSFCNG 
CKRLLWSSHNLESKSTKRDILVAFKGGCSILPLP
YMIQCKHFVTQYEPVLIESLKDMMDPVAVCKKV 
GACHGPRTPLLGTDQCALGPSFWCRSQEAAKLC 
NAVQHCQKHVWKEMHLHAGEHA 

3219 

A 

1623 

572 

TSAEGWKGCTCTFKDRSKLREHLRSHTQEKWA 

CPTCGGMFANNTKFLDHIRRQTSLDQQHFQCSH 

CSKRFATERLLRDHMRNHVNHYKCPLCDMTCPL 

PSSLRNHMRFRHSEDRPFKCDCCDYSCKNLIDLQ 

KHLDTHSEEPAYRCDFENCTFSARSLCSIKSHYR 

KVHEGDSEPRYKCHVCDKCFTRGNNLTVHLRK 

KHQFKWPSGHPRFRYKEHEDGYMRLQLVRYES 

VELTQQLLRQPQEGSGLGTSLNESSLQGIILETVP 

GEPGRKEEEEEGKGSEGTALSASQDNPSSVIHVV 

NQTNAQGQQEIVYYYLSEAPGEPPPVPEPPSGGI 

MEKLQGIAEEPEIQMV 

3220 

A 

2760 

745 

SLGJPSGNTRGTGLVLDGDTSYTYHLVCMGPEAS 

GWGQDEPQTWPTDHRAQQGVQRQGVSYSVHA 

YTGQPSPRGLHSENREDEGWQVYRLGARDAHQ 

GRPTWALRPEDGEDKEMKTYRLDAGDADPRRL

CDLERERWAVIQGQAVRKSSTVATLQGTPDHGD 

PRTPGPPRSTPLEENVVDREQIDFLAARQQFLSLE 

QANKGAPHSSPARGTPAGTTPGASQAPKAFNKP 

HLANGHVVPIKPQVKGVVREENKVRAVPTWAS

VQVVDDPGSLASVESPGTPKETPIEREIRLAQERE 

ADLREQRGLRQATDHQELVEIPTRPLLTKLSLITA 

PRRERGRPSLYVQRDIVQETQREEDHRREGLHV 

GRASTPDWVSEGPQPGLRRALSSDSILSPAPDAR 

AADPAPEVRKVNRIPPDAYQPYLSPGTPQLEFSA 

FGAFGKPSSLSTAEAKAATSPKATMSPRHLSESS 

GKPLSTKQEASKPPRGCPQANRGVVRWEYFRLR 

PLRFRAPDEPOOAOVPHVWGWEVAGAPAI RI O 

KSQSSDLLERERESVLRREQEVAEERRNALFPEV 

FSPTPDENSDQNSRSSSQASGITGSYSVSESPFFSPI 

HLHSNVAWTVEDPVDSAPPGQRKKEQWYAGIN 

PSDGINSEVLEAIRVTRHKNAMAERWESRIYASE 

EDD 

3221 

A

15 

478 

SRVFFFFFFFPAFKMSKRGRGGSSGAKFRISLGLP 
VGAVINCADNTGAKNLYIISVKGIKGRLNRLPAA
GVGDMVMATVKKGKPELRKKVHPAVVIRQRKS 
YRRKDGVFLYFEDNAGVIVNNKGEMKGSAITGP 
VAKECADLWPRIASNAGSIA 

3222 

A

207 

1321 

PLEPLHPANRSPATMAELQEVQITEEKPLLPGQTP 

EAAKTHSVETPYGSVTFTVYGTPKPKRPAILTYH 

DVGLNYKSCFQPLFQFEDMQEIIQNFVRVHVDAP 

GMEEGAPVFPLGYQYPSLDQLADMIPCVLQYLN 

FSTIIGVGVGAGAYILARYALNHPDTVEGLVLINI

DPNAKGWMDWAAHKLTGLTSSIPEMILGHLFSQ 

EELSGNSELIQKYRNIITHAPNLDNIELYWNSYNN 

RRDLNFERGGDITLRCPVMLVVGDQAPHEDAVV 

ECNSKLDPTQTSFLKMADSGGQPQLTQPGKLTE 

AFKYFLQGMGYMASSCMTRLSRSRTASLTSAAS 

VDGNRSRSRTLSQSSESGTLSSGPPGHTMEVSC 

3223 

A 

132 

1664 

SARRWGAAGAGPHGLHLRAHGPRPSVRTGLPSV 

GRQAAGAAMGRGWGFLFGLLGAVWLLSSGHGE 

EQPPETAAQRCFCQVSGYLDDCTCDVETIDRFNN 

YRLFPRLQKLLESDYFRYYKVNLKRPCPFWNDIS 

QCGRRDCAVXPCQSDEVPDGIKSASYKYSEEAN 

NLIEECEQAERLGAVDESLSEETQKAVLQWTKH 

DDSSDNFCEADDIQSPEAEYVDLLLNPERYTGYK 

GPDAWKIWNVIYEENCFKPQTIKRPLNPLASGQG 

TSEENTFYSWLEGLCVEKRAFYRLISGLHASINV 

HLSARYLLQETWLEKKWGHNITEFQQRFDGILTE 

GEGPRRLKNLYFLYLIELRALSKVLPFFERPDFQL 

FTGNKIQDEENKMLLLEILHEIKSFPLHFDENSFF 

AGDKKEAHKLKEDFRLHFRNISRIMDCVGCFKC

RLWGKLQTQGLGTALKILFSEKLIANMPESGPSY 

EFHLTRQEIVSLFNAFGRISYKCERIRKTSRKLLQ 

NIH 

3224 

A 

2 

803 

PGSTISWDRDAAGESGTRAASPSPSGSRTAGRLP 

SPSYSPLPAPSLFPPPPLPAPAASTMSAGGDFGNP 

LRKFKLVFLGEQSVGKTSLITRFMYDSFDNTYQA 

TIGIDFLSKTMYLEDRTVRLQLWDTAGQERFRSL 

IPSYIRDSTVAVVVYDITNLNSFQQTSKWIDDVRT 

ERGSDVIIMLVGNKTDLADKRQITIEEGEQRAKE 

LSVMFIETSAKTGYNVKQLFRRVASALPGMENV 

QEKSKEGMIDIKLDKPQEPPASEGGCSC 

3225 

A 

3 

5054 

PEVTKPSLSQPTAASPIGSSPSPPVNGGNNAKRVA 

VPNGQPPSAARYMPREVPPRFRCQQDHKVLLKR 

GQPPPPSCMLLGGGAGPPPCTAPGANPNNAQVT 

GALLQSESGTAPDSTLGGAAASNYANSTWGSGA 

SSNNGTSPNPIHIWDKVIVDGSDMEEWPCIASKD 

TESSSENTTDNNSASNPGSEKSTLPGSTTSNKGK 

GSQCQSASSGNECNLGVWKSDPKAKSVQSSNST 

TENNNGLGNWRNVSGQDRIGPGSGFSNFNPNSN 

PSAWPALVQEGTSRKGALETDNSNSSAQVSTVG 

QTSREQQSKMENAGVNFVVSGREQAQIHNTDGP 

KNGNTOSLNLSSPWMENKGMPFGMGLGNTSRS 

TDAPSQSTGDRKTGSVGSWGAARGPSGTDTVSG 

QSNSGNNGNNGKEREDSWKGASVQKSTGSKND 

SWDNNNRSTGGSWOTGPQDSNDNKWGEGNKM 

TSGVSQGEWKQPTGSDELKIGEWSGPNQPNSST 

GAWDNQKGHPLLENQGNAQAPCWGRSSSSTGS 

EVEGQSTGSNHKAGSSDSHNSGRRSYRPTHPDC 

QAVLQTLLSRTDLDPRVLSNTGWGQTQIKQDTV 

WDIEEVPRPEGKSDKGTEGWESAATQTKNSGG 

WGDAPSQSNQMKSGWGELSASTEWKDPKNTGG 

WNDYKNNNSSNWGGGRPDEKTPSSWNENPSKD 

QGWGGGRQPNQGWSSGKNGWGEEVDQTKNSN 

WESSASKPVSGWGEGGQNEIGTWGNGGNASLA 

SKGGWEDCKRSPAWNETGRQPNSWNKQHQQQ 

QPPQQPPPPQPEASGSWGGPPPPPPGNVRPSNSS 

WSSGPQPATPKDEEPSGWEEPSPQSISRKMDIDD 

GTSAWGDPNSYNYKNVNLWDKNSQGGPAPREP 

NLPTPMTSKSASDSKSMQDGWGESDGPVTGARH 

PSWEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSLKGGNNDSWMNPLAKQFSNMGLL 

SQTEDNPSSKMDLSVGSLSDKKFDVDKRAMNLG 

DFNDIMRKDRSGFRPPNSKDMGTTDSGPYFEKG 

GSHGLFGNSTAQSRGLHTPVQPLNSSPSLRAQVP 

PQFISPQVSASMLKQFPNSGLSPGLFNVGPQLSPQ 

QIAMLSQLPQIPQFQLACQLLLQQQQQQQLLQN 

QRKISQAVRQQQEQQLARMVSALQQQQQQQQR 

QPGMKHSPSHPVGPKPHLDNMVPNALNVGLPDL 

QTKGPIPGYGSGFSSGGMDYGMVGGKEAGTESR 

FKQWTSMMEGLPSVATQEANMHKNGAIVAPGK 

TRGGSPYNQFDIIPGDTLGGHTGPAGDSWLPAKS 

PPTNKIGSKSSNASWPPEFQPGVPWKGIQNIDPES 

DPYVTPGSVLGGTATSPIVDTDHQLLRDNTTGSN 

SSLNTSLPSPGAWPYSASDNSFTNVHSTSAKFPD 

YKSTWSPDPIGHKPTHLSNKMWKNHISSRNTTPL 

PRPPPGLTNPKPSSPWSSTAPRSVRGWGTQDSRL 

ASASTWSDGGSVRPSYWLVLHNLTPQIDGSTLRT 

ICMQHGPLLTFHLNLTQGTALIRYSTKQEAAKAQ 

TALHMCVLGNTTILAEFATDDEVSRFLAQAQPPT 

PAATPSAPAAGWQSLETGQNQSDPVGPALNLFG 

GSTGLGQWSSSAGGSSGADLAGASLWGPPNYSS 

SLWGVPTVEDPHRMGSPAPLLPGDLLGGGSDSI 

3226 

A 

200 

1387 

VPWKRQDEQLSLQVETLYLDSPAVIHLLSPTFLP 

PSSLPPFLQIVDSSSSACTLDSFFPFLAPWDSPQDC 

GFKDHQPLTLQALTVELARWTLMLLLSTAMYG 

AHAPLLALCHVDGRVPFRPSSAVLLTELTKLLLC 

AFSLLVGWQAWPQGPPPWRQAAPFALSALLYG 

ANNNLVIYLQRYMDPSTYQVLSNLKIGSTAVLY 

CLCLRHRLSVRQGLALLLLMAAGACYAAGGLQ 

VPGNTLPSPPPAAAASPMPLHITPLGLLLLELYCLI 

SGLSSVYTELLMKRQRLPLALQNLFLYTFGVLLN 

LGLHAGGGSGPGLLEGFSGWAALVVLSQALNGL 

LMSAVMKHGSSITRLFVVSCSLVVNAVLSAVLL 

RLQLTAAFFLATLLIGLAMRLYYGSR 

3227 

A 

1 

679 

RSTRARTRRPGLRAVPLPVGGFLGKMKWVWAL 

LLLAALGSGRAERDCRVSSFRVKENFDKARFSGT 

WYAMAKKDPEGLFLQDNIVAEFSVDETGQMSA 

TAKGRVRLLNNWDVCADMVGTFTDTEDPAKFK 

MKYWGVASFLQKGNDDHWIVDTDYDTYAVQY 

SCRLLNLDGTCADSYSFVFSRDPNGLPPEAQKIV

RQRQEELCLARQYRLIVHNGYCDGRSERNLL 

3228 

A 

430 

1104 

QQESPAAGAARMNCKEGTDSSCGCRGNDEKKM 

LKCVVVGDGAVGKTCLLMSYANDAFPEEYVPT 

VFDHYAVTVTVGGKQHLLGLYDTAGQEDYNQL 

RPLSYPNTDVFLICFSVVNPASYHNVQEEWVPEL 

KDCMPHVPYVLIGTQIDLRDDPKTLARLLYMKE 

KPLTYEHGVKLAKAIGAQCYLECSALTQKGLKA 

VFDEAILTIFHPKKKKKRCSEGHSCCSII

3229 

A 

25 

722 

AISAGRSAKMQLKPMEINPEMLNKVLSRLGVAG 

QWRFVDVLGLEEESLGSVPAPACALLLLFPLTAQ 

HENFRKKQIEELKGQEVSPKVYFMKQTIGNSCGT 

IGLIHAVANNQDKLGFEDGSVLKQFLSETEKMSP 

EDRAKCFEKNEAIQAAHDAVAQEGQCRVDDKV 

NFHFILFNNVDGHLYELDGRMPFPVNHGASSEDT 

LLKDAAKVCREFTEREQGEVRPSAVALCKAA 

3230 

A 

282 

1479 

GDAATTACAPPDWFLGPRKLAAGPAGGGMLPR 

RLLAAWLAGTRGGGLLALLANQCRFVTGLRVR 

RAQQIAQLYGRLYSESSRRVLLGRLWRRLHGRP 

GHASALMAALAGVFVWDEERIQEEELQRSINEM 

KRLEEMSNMFQSSGVQHHPPEPKAQTEGNEDSE 

GKEQRWEMVMDKXHFKLWRRPITGTHLYQYRV 

FGTYTDVTPRQFFNVQLDTEYRKKWDALVIKLE 

VIERDVVSGSEVLHWVTHFPYPMYSRDYVYVRR 

YSVDQENNMMVLVSRAVEHPSVPESPEFVRVRS 

YESQMVIRPUKSFDENGFDYLLTYSDNPQTVFPR 

YCVSWMVSSGMPDFLEKLHMATLKAKNMEIKV 
KDYISAKPLEMSSEAKATSQSSERKNEGSCGPAR 
IEYA 

3231 

A 

2117 

590 

FVPEPPEAGASSPCAPGDPDMSFRKVVRQSKFRH 

VFGQPVKNDQCYEDIRVSRVTWDSTFCAVNPKF 

LAVIVEASGGGAFLVLPLSKTGRIDKAYPTVCGH

TGPVLDIDWCPHNDEVIASGSEDCTVMVWQIPE 

NGLTSPLTEPVVVLEGHTKRVGIIAWHPTARNVL 

LSAGCDNWLIWNVGTAEELYRLDSLHPDLIYN 

VSWNHNGSLFCSACKDKSVRIIDPRRGTLVAERE

KAHEGARPMRAIFLADGKVFTTGFSRMSERQLA 

LWDPENLEEPMALQELDSSNGALLPFYDPDTSV 

VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQR 

OMrt^MPl^RrTl FV^KPFTARFYkf 1 HFRT^r'FPTVM 
\jiViVJoivir XVIvvJIjI., v orvv^lj-i rVIVr I rVJuilJClVJVVw'Iii 1 V Ivl 

TVPRKSDLFQDDLYPDTAGPEAALEAEEWVSGR 
DADPELISLREAYVPSKQRDLKISRRNVLSDSRPA 
MAPGSSHLGAPASTTTAADATPSGSLARAGEAG 
KLEEVMQELRALRALVKEQGDRICRLEEQLGRM 
ENGDA 

3232 

A 

3 

718 

RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 

fiTAriAN/fOT PWV1T P.FT I FRr;M>J<;nPT\4TnT<5QQ 
VJ 1 /\VJrVIVl\^L»L^ W V ILvjr LtLir tvVJ t liN 0\)r 1 1V1 1 i ooo 

QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST 
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVVVVI
ILVGVVSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 

3233 

A 

3 

718 

RLREDDRRGLPLSSPLWTEPPLSCCLPATYPADM 

\j l /\Vj/\lvi^L,v_* W V lJ_iVjrL«l-.r ivvjrl IN oyr 1 IVI IK^l ooo 

QGGLGGLSLTTEPVSSNPGYIPSSEANRPSHLSST
GTPGAGVPSSGRDGGTSRDTFQTVPPNSTTMSLS 
MREDATILPSPTSETVLTVAAFGVISFIVILVWVI 
DLVGVVSLRFKCRKSKESEDPQKPGSSGLSESCST 
ANGEKDSITLISMKNINMNNGKQSLSAEKVL 

3234 

A 

1169 

4292 

AGDCGRLGVGGSEFPWEGSALGASPLPPICLQSR 

TWLLRAPAPAELGELEEVAAGRGDVWEPFLDSP 

GREESLQEASPRLADHGSSSGGGWEVKRSQRLR 

RGPSSPRRPYQDMEYERRGGRGDRTGRYGATDR 

SQDDGGENRSRDHDYRDMDYRSYPREYGSQEG 

KHDYDDSSEEQSAEDSYEASPGSETQRRRRRRH 

RHSPTGPPGFPRDGDYRDQDYRTEQGEEEEEEED 

EEEEEKASNIVMLRMLPQAATEDDIRGQLQSHG 

VQAREVRLMRNKSSGQSRGFAFVEFSHLQDATR 

WMEANQHSLNILGQKVSMHYSDPKPKINEDWL

CNKCGVQNFKRREKCFKCGVPKSEAEQKLPLGT 

RLDQQTLPLGGRELSQGLLPLPQPYQAQGVLAS 

QALSQGSEPSSENANDTIILRNLNPHSTMDSILGA 

LAPYAVLSSSNVRVIKDKQTQLNRGFAFIQLSTIE 

AAQLLQILQALHPPLTIDGKTINVEFAKGSKRDM 

ASNEGSRISAASVASTAIAAAQWAISQASQGGEG 

TWATSEEPPVDYSYYOODFGYGNSOGTESSLYA 

HGYLKGTKGPGITGTKGDPTGAGPEASLEPGADS 

VSMQAFSRPQPGAAPGIYQQSAEASSSQGTAANS 

QSYTIMSPAVLKSELQSPTHPSSALPPATSPTAQE 

SYSQYPVPDVSTYQYDETSGYYYDPQTGLYYDP 

NSQYYYNAQSQQYLYWDGERRTYVPALEQSAD 

GHKETGAPSKEGKEKKEKHKTKTAQQIAKDME

RWARSLNKQKENFKNSFQPISSLRDDERRESATA 

DAGYAILEKKGALAERQHTSMDLPKLASDDRPS 

PPRGLVAAYSGESDSEEEQERGGPEREEKLTDW 

QKLACLLCRRQFPSKEALIRHQQLSGLHKQNLEI 

HRRAHLSENELEALEKNDMEQMKYRDRAAERR 

EKYGIPEPPEPKRRKYGGISTASVDFEQPTRDGLG 

SDNIGSRMLQAMGWKEGSGLGRKKQGIVTPIEA

QTRVRGSGLGARGSSYGVTSTESYKETLHKTMV 

TRFNEAQ 

3235 

A 

3 

1217 

PSFLNTGLGPTALGVLGGAGAGLMSNPSPQVPEE 
EASTSVCRPKSSMASTSRRQRRERRFRRYLSAGR 
LVRAQALLQRHPGLDVDAGQPPPLHRACARHD 
APALCLLLRLGADPAHQDRHGDTALHAAARQG
PDAYTDFFLPLLSRCPSAMGIKNKDGETPGQILG 
WGPPWDSAEEEEEDDASKEREWRQKLQGELED
EWQEVMGRFEGDASHETQEPESFSAWSDRLARE 
HAOKCOOOOREAEGSCRPPRAEGSSOSWROOFF 
EQRLFRERARAKEEELRESRARRAQEALGDREP 
KPTRAGPREEHPRGAGRGSLWRFGDVPWPCPGG
GDPEAMAAALVARGPPLEEQGALRRYLRVQQV 
RWHPDRFLQRFRSQIETWELGRVMGAVTALSQA
LNRHAEALK

3236 

A 

3 

1416 

GPASGMAEPTSDFETPIGWHASPELTPTLGPLSDT 

APPRDRWMFWAMLPPPPPPLTSSLPAAGSKPSSE 

SQPPMEAQSLPGAPPPFDAQILPGAQPPFDAQSPL 

DSQPQPSGQPWNFHASTSWYWRQSSDRFPRHQK 

SLNPAVKNSYYPRKYDAKFTDFSLPPSRKQKKK 

KRKEPVFHFFCDTCDRGFKNQEKYDKHMSEHTK 

CPELDCSFTAHEKIVQFHWRNMHAPGMKKIKLD

TPEEIARWREERRKNYPTLANIERKXKLKLEKEK 

RGAVLTTTOYGKMKGMSRHSOMAKIRSPGKJvfH 

KWKNDNSRQRAVTGSGSHLCDLKLEGPPEANA 

DPLGVLINSDSESDKEEKPQHSVIPKEVTPALCSL 

MSSYGSLSGSESEPEETPIKTEADVLAENQVLDSS 

APKSPSQDVKATVRNFSEAKSENRKKSFEKTNPK 

REKRLSQLSNVERTKNTPSISLGNASSSGHST 

3237 

A 

3806 

2204 

FVGEQEGGCEAGAGRGAQTYPGEAGERWFGRR 

RRRGRVVSRKKMSLKSERRGIHVDQSDLLCKKG 

CGYYGNPAWQGFCSKCWREEYHKARQKQIQED

WELAERLQREEEEAFASSQSSQGAQSLTFSKFEE 

KXTNEKTRKVTTVKKFFSASSRVGSKKEIQEAKA 

PSPSINRQTSIETDRVSKEFIEFLKTFHKTGQEIYK 

QTKLFLEGMHYKRDLSIEEQSECAQDFYHNVAE

RMQTRGKVPPERVEKIMDQIEKYIMTRLYKYVF 

CPETTDDEKKDI^QKRIRALRWVTPQMLCVPV 

NEDIPEVSDMVVKAITDIIEMDSKRVPRDKLACIT 

KCSKHIFNAIKITKNEPASADDFLPTLIYIVLKGNP 

iwuivi xxA x "i ljuvi x xvx ™ x>i rxuru^i/i X-/X X X^X X A ▼ LJI.YVI 1 ™ 4. 

PRLQSNIQYITRFCNPSRLMTGEDGYYFTNLCCA 

VAFIEKLDAQSLNLSQEDFDRYMSGQTSPRKQEA 

ESWSPDACLGVKQMYKNLDLLSQLNERQERIMN 

EAKKLEKDLIDWTDGIAREVQDIVEKYPLEIKPP 

NQPLAAIDSENVENDKLPPPLQPQVYAG 

3238 

A 

1373 

449 

VLSVCPTGVFRPAPCRMAFMKKYLLPILGLFMA 
YYYYSANEEFRPEMLQGKKVIVTGASKGIGREM 


WO 01/57190 


PCT/US01/04098



AYHLAKMGAHVVVTARSKETLQKVVSHCLELG 
AASAHYIAGTMEDMTFAEQFVAQAGKLMGGLD
MLILNHITNTSLNLFHDDIHHVRXSMEVNFLSYV 
VLTVAALPMLKQSNGSIVVVSSLAGKVAYPMVA 
AYSASKFALDGFFSSIRKEYSVSRVNVSITLCVLG
LIDTETAMKAVSGIVHMQAAPKEECALEIIKGGA 
LRQEEVYYDSSLWTTLLIRNPCRKILEFLYSTSYN 
MDRFINK 

3239 

A 

213 

422 

ERTMQLEIKVALNFIIFYLYNKLLW/QPLKKK*EA
HWYPDKPLKGSGFHT/GEMVDPVGELAAKRSGL 
TVED 

3240 

A 

1255 

1425 

HESYHVNPNLCNPVAPTSGAHSIG*KWPSWLGA 
VAHSCNPSTLVGRGGRITRGQELR 

3241 

A 

161 

547 

PAGIGRSTAKTPGTPGSLEMENLKSGVYPLKEAS 
GCPGADRNLLVYSFYEKGPLTFRDVAIEFSLEEW 
QCLDTAQQDLYRKVMLENYRNLVFLAGIAVSKP
DLITCLEQGKEPWNMKRHAMVDQPPGR

3242 

A 

50 

243 

PLPARGKSTLPATFCSPSAPELASMSVVPPNRSQT 
GWPRGVTQFGNKYIQQTKPLTLERTINL 

3243 

A 

380 

702 

FVAYLKLPFFSQVCLFASSEMFFTTSRKNMSQKLS 
LLLLVFGLIWGLMLLHYTFQQPRHQSSVKLREQI
LDLSKRYVKALAEENKNTVDVENGASMAGYGK 
ITVEYF 

3244 

A 

37 

1391 

VLMDGRMMRSMRLREEESPGPSHTASCLCGSAP 

CILCSCCPASRNSTVSRLIFTFFLFLGVLVSIIMLSP

GVESQLYKLPWVCEEGAGIPTVLQGHIDCGSLLG 

YRAVYRMCFATAAFFFFFTLLMLCVSSSRDPRA 

AIQNGFWFFKFLILVGLTVGAFYIPDGSFTNIWFY

FGVVGSFLFILIQLVLLIDFAHSWNQRWLGKAEE

CDSRAWYAGLFFFTLLFYLLSLAAVALMFMYYT 

EPSGCHEGKVFISLNLTFCVCVSIAAVLPKVQDA 

QPNSGLLQASVITLYTMFVTWSALSSIPEQKCNP 

HLPTQLGNETVVAGPEGYETQWWDAPSIVGLIIF 

LLCTLFISLRSSDHRQVNSLMQTEECPPMLDATQ 

QQQQVAACEGRAFDNEQDGVTYSYSFFHFCLVL 

ASLHVMMTLTNWYKPGETRKMISTWTAVWVKI 

CASWAGLLLYL 

3245 

A 

52 

426 

SSLGNEDDEILSLAKDITGMFVASHRKMRAHQV 
LTFLLLFVITSVASENASTSRGCGLDLLPQYVSLC 
DLDAIWGIVVEAAAGAGALITLLLMLILLVRLPF
FKEKEKKSPVGLHFLFLLGTLGP 

3246 

A 

3 

515 

HEVCGSGCCCHCCAGGPVARQKALPRLRGVMS 

RFLNVLRSWLVMVSIIAMGNTLQSFRDHTFLYEK 

LYTGKPNLVNGLQARTFGIWTLLSSVIRCLCAIDI 

HNKTLYHITLWTFLLALGHFLSELFVYGTAAPTI 

GVLAPLMVASFSILGMLVGLRYLEVEPVSRQKK 

RN 

3247 

A 

1 

932 

ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEENSV 

THHEVKCQGKPLAGIYRKREEKRNAGNAVRSA 

MKSEEQKIKDARKGPLVPFPNQKSEAAEPPKTPP 

SSCDSTNAAIAKQALKKPIKGKQAPRKKAQGKT 

QQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELI 

ESGKEEGMKIDLIDGKGRGVIATKQFSRGDFVVE 

YHGDLIEITDAKKREALYAQDPSTGCYMYYFQY

LSKTYCVDATRETNRLGRLINHSKCGNCQTKLH 

DIDGVPHLILIASRDIAAGEELLYDYGDRSKASIE
AHPWLKH 

3248 

A 

3 

870 

PGSTISCSELKGTQCRATAGSRGRRPPMTCWLRG 

VTATFGRPAEWPGYLSHLCGRSAAMDLGPMRK 

SYRGDREAFEETHLTSLDPVKQFAAWFEEAVQC 

PDIGEANAMCLATCTRDGKPSARMLLLKGFGKD 

GFRFFTNFESRKGKELDSNPFASLVFYWEPLNRQ 

VRVEGPVKKLPEEEAECYFHSRPKSSQIGAVVSH 

QSSVIPDREYLRKKNEELEQLYQDQEVPKPKSW 

GGYVLYPQVMEFWQGQTORLHDRIVFRRGLPTG 

DSPLGPMTHRGEEDWLYERLAP 

3249 

A 

43 

1210 

TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRG 

EEGHDPKEPEQLRKLFIGGLSFETTDDSLREHFEK 

WGTLTDCVVMRDPQTKRSRGFGFVTYSCVEEV 

DAAMCARPHKVDGRVVEPKRAVSREDSVKPGA 

HLTVKKIFVGGIKEDTEEYNLRDYFEKYGKIETIE

VMEDRQSGKKRGFAFVTFDDHDTVDKIVVQKY 

GNFMGRGGNFGGGGGNFGRGGNFGGRGGYGG 

GGGGSRGSYGGGDGGYNGFGGDGGNYGGGPG 

YSSRGGYGGGGPGYGNQGGGYGGGGGYDGYN 

EGGNFGGGNYGGGGNYNDFGNYSGQQQSNYGP 

MKGGSFGGRSSGSPYGGGYGSGGGSGGYGSRRF 

3250 

A 

32 

1175 

VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

FFIRTHKVTWGFCTGGATYWKFIAVTAOKIGFC

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 

3251 

A 

32 

1175 

VAGRGDMAALRDAEIQKDVQTYYGQVLKRSAD 

LQTNGCVTTARPVPKHIREALQNVHEEVALRYY 

GCGLVIPEHLENCWILDLGSGSGRDCYVLSQLVG 

EKGHVTGIDMTKGQVEVAEKYLDYHMEKYGFQ 

ASNVTFIHGYIEKLGEAGIKNESHDIVVSNCVINL 

VPDKQQVLQEAYRVLKHGGELYFSDVYTSLELP 

EEIRTHKVLWGECLGGALYWKELAVLAOKIGFC 

PPRLVTANLITIQNKELERVIGDCRFVSATFRLFK 

HSKTGPTKRCQVIYNGGITGHEKELMFDANFTFK 

EGEIVEVDEETAAILKNSRFAQDFLIRPIGEKLPTS 

GGCSALELKDIITDPFKLAEESDSMKSRCVPDAA 

GGCCGTKKSC 

3252 

A 

1 

574 

PLGSNTA P ALRVK4VO A WYMDD A PGDPROPHR P 

DPGRPVGLEQLRRLGVLYWKLDADKYENDPELE 

KIRRERNYSWMDIITICKDKLPNYEEKIKMFYEE

HLHLDDEIRYILDGSGYFDVRDKEDQWIRIFMEK 

GDMVTLPAGIYHRFTVDEKNYTKAMRLFVGEPV 

WTAYNRPADHFEARGQYVKFLAQTA 

3253 

A 

2 

984 

ARAAAHCGICRLVRWWRKRRSVMGIQTSPVLLA 
SLGVGLVTLLGLAVGSYLVRRSRJRPQVTLLDPNE 

KYLLRLLDKTTVSHNTKRFRFALPTAHHTLGLPV 

GKHIYLSTRIDGSLVIRPYTPVTSDEDQGYVDLVI 

V VVT i^nVHPKFPFGfiKM^OYI DST KVGDVVFF 
rv v i i^cvvj v iii rvx r x_/vj vj xvxviov^ x xjx^ox^iv v vj v vli 

RGPSGLLTYTGKGHFNIQPNKKSPPEPRVAKKLG

MIAGGTGITPMLQLIRAILKVPEDPTQCFLLFANQ 

TEKDIILREDLEELQARYPNRFKLWFTLDHPPKD 

WAYSKGFVTADMIREHLPAPGDDVLVLLCGPPP 

MVQLACHPNLDKLGYSQKMRFTY 

3254 

A 

1 

968 

LQSAGEGVTHVLILLESPARPVAAVTQVQRRRY 
HRLSDMSMLAERRRKQKWAVDPQNTAWSNDD 
SKFGQRMLEKMGWSKGKGLGAQEQGATDHIKV 
QVKNNHLGLGATINNEDNWIAHQDDFNQLLAEL 

YMKFTKGKDLSSRSKTDLDCIFGKRQSKKTPEG 
DASPSTPEENETTTTSAFTIQEYFAKRMAALKNK 
PO VP VPH <sFiT 9FTOVFR K T* GK K\i NK F A TO K D VF 
SYLQPKAKRHTEGKPERAEAQERVAKKKSAPAE 
EQLRGPCWDQSSKASAQDAGDHVQPA 

3255 

A 

173 

439 

GSAAMKVKIKCWNGVATWLWVANDENCGICR 
MAFMnrrpnrPcvpnnnrPT vwnnr^HfFHMHr 

\yLr\ I in vjuur ivv-/XV v r \JkJU\^r jo v w vjVy/v^oxxox xxivxi 

ILKWLHAQQVQQHCPMCRQEWKFKE 

3256 

A 

2 

377 

TAARRRQKGTAARRRQKGTLEEVVLPPRSCRVF 
WIHSGTTMSKVSFKITLTSDPRLPYKVLSVPESTP 
FTAVLKFAAEEFKVPAATSAIITNDGIGINPAQTA 
GNVFLKHGSELRIIPRDRVGSC 

3257 

A 

3 

1454 

GCSAAAAGAGSGPWAAQEKQFPPALLSFFIYNPR 

FGPREGQEENKILFYHPNEVEKNEKIRNVGLCEAI

VQFTRTFSPSKPAKSLHTQKNRQFFNEPEENFWM 

VMVVRNPIIEKQSKDGKPVIEYQEEELLDKVYSS 

VLRQCYSMYKLFNGTFLKAMEDGGVKLLKERL 

EKFFHRYLQTLHLQSCDLLDIFGGISFFPLDKMTY 

LKIQSFINRMEESLNIVKYTAFLYNDQLIWSGLEQ

DDMRILYKYLTTSLFPRHIEPELAGRDSPIRAEMP 

GNLQHYGRFLTGPLNLNDPDAKCRFPKIFVNTD 

Ul I SZCt-tLkl^l V X IVrVlVlo/VrV V v^X LVLLLJf\iJ V til X L-ilSl 

R^LDSIVGPQLTVLASDICEQFNINKRMSGSEKEP 

QFKFIYTOHMNLAEKSTVHMRKTPSVSLTSVHPD 

LMKlLGDmSDFTRVDEDEEIIVKAMSDYWVVG 

KKSDRMLYVILNQKNANLIEVNEEVKKLCATQF

NNIFFLD

3258 

A 

113 

1558 

APRGCSNlTrl^KXKPFIEKKKAVSFHLVHRSQRD 

PLAADESAPQRVLLPTQKIDNEERRAEQRKYGVF 

FDDDYDYLQHLKEPSGPSELIPSSTFSAHNRREEK 

EETLVIPSTGIKLPSSVFASEFEEDVGLLNKAAPV 

SGPRLDFDPDIVAALDDDFDFDDPDNLLEDDFIL 

QANKATGEEEGMDIQKSENEDDSEWEDVDDEK 

GDSNDDYDSAGLLSDEDCMSVPGKTHRAIADHL

FWSEETKSRFTEYSMTSSVMRRNEQLTLHDERFE 

KFYEQYDDDEIGALDNAELEGSIQVDSNRLQEVL 

MDYYKEKAENCVKLNTLEPLEDQDLPMiNnELDES 

EEEEIvnTVVLEEAKEKWDCESICSTYSNLYNHPQ 

LIKYQPKPKQIRISSKTGIPLNVLPKKGLTAKQTE

RIQMINGSDLPKVSTQPRSKNESKEDKRARKQAI 

KJEERKERRVEKKANKLAFKLEKl^QEKELLNLK 

KNVEGLKL 

3259 

A 

3 

964 

QMEPGNDTQISEFLLLGFSQEPGLQPFLFGLFLSM 

YLVTVLGNLLIILATISDSHLHTPMYFFLSNLSFA 

DICVTSTTIPKMLMNIQTQNKVITYIACLMQMYF

FILFAGFENFLLSVMAYDRFVAICHPLHYMVIMN 

PHLCGLLVLASWTMSALYSLLQILMVVRLSFCT 

ALEIPHFFCELNQVIQLACSDSFLNHMVIYFTVAL 

LGGGPLTGILYSYSKIISSIHAISSAQGKYKAFSTC 

ASHLSVVSLFYGAILGVYLSSAATRNSHSSATAS 

VMYTVVTPMLNPFIYSLRNKDIKRALGIHLLWGT 

MKGQFFKKCP 

3260 

A 

34 

2573 

IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTE 

HGTPKPFRKFDSVAFGESQSEDEQFENDLETDPP 

NWQQLVSREVLLGLKPCEIKRQEVINELFYTERA 

HVRTLKVLDQVFYQRVSREGILSPSELRKIFSNLE 

DILQLHIGLhffiQMKAVRKRNETSVlDQIGEDLLT 

WFSGPGEEKLKHAAATFCSNQPFALEMIKSRQK 

KDSRFQTFVQDAESNPLCRRLQLKDIIPTQMQRL 

TKYPLLLDNIATYTEWPTEREKVKKAADHCRQIL 

NYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEY 

PNVEELRNLDLTKJIKMIHEGPLVWKVNRDKTID 

LYTLLLEDILVLLQKQDDRLVLRCHSKILASTAD 

SKHTFSPVIKLSTVLVRQVATDNKALFVISMSDN 

GAQIYELVAQTVSEKTVWQDLICRMAASVKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTG 

LQSPDRDLGLESTLISSKPQSHSLSTSGKSEVRDL 

FVAERQFAKEQHTDGTLKEVGEDYQIAIPDSHLP 

VSEERWALDALRNLGLLKQLLVQQLGLTEKSVQ 

EDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSG 

EGHMPFRTGTGDIATCYSPRTSTESFAPRDSVGL 

APQDSQASMLVMDHMIMTPEMPTMEPEGGLDD

SGEHFFDAREAHSDENPSEGDGAVNKEEKDVNL 

RISGNYLILDGYDPVQESSTDEEVASSLTLQPMT 

GIPAVESTHQQQHSPQNTHSDGAISPFTPEFLVQQ 

RWGAMEYSCFEIQSPSSCADSQSQIMEYIHKIEA 

DLEHLKKVEESYTILCQRLAGSALTDKHSDKS 

3261 

A 

1

2100 

AVEFAEGALTMAPWPELGDAQPNPDKYLEGAA 

GQQPTAPDKSKETNKTDNTEAPVTKIELLPSYST 

ATLEDEPTEVDDPWNLPTLQDSGIKWSERDTKGK 

ILCFFQGIGRLILLLGFLYFFVCSLDILSSAFQLVG 

GKMAGQFFSNSSIMSNPLLGLVIGVLVTVLVQSS 

STSTSIVVSMVSSSLLTVRAAIPIIMGANIGTSITNT 

IVALMQVGDRSEFRRAFAGATVHDFFNWLSVLV 

LLPVEVATHYLEIITQLIVESFHFKNGEDAPDLLK 

VITKPFTKLIVQLDKKVISQIAMNDEKAKNKSLV 

KIWCKTFTNKTQINVTVPSTANCTSPSLCWTDGI 

QNWTMKNVTYKENIAKCQHIFVNFHLPDLAVGT

ILLILSLLVLCGCLIMIVKILGSVLKGQVATVIKKT

INTDFPFPFAWLTGYLAILVGAGMTFIVQSSSVFT

SALTPLIGIGVITIERAYPLTLGSNIGTTTTAILAAL

ASPGNALRSSLQIALCHFFFNISGILLWYPIPFTRL 

PIRMAKGLGNISAKYRWFAVFYLIIFFFLIPLTVFG 

LSLAGWRVLVGVGVPVVFIIILVLCLRLLQSRCPR 

VLPKKLQNWNFLPLWMRSLKPWDAVVSKFTGC 

FQMRCCCCCRVCCRACCLLCGCPKCCRCSKCCE 

DLEEAQEGQDVPVKAPETFDNITISREAQGEVPA 

SDSKTECTAL 

3262 

A 

30 

1377 

SQQGSQPHRQGPPSLLTAPHSLDLPALPPGPRGS 

QGKLRRVLVPMSVKPSWGPGPSEGVTAVPTSDL 

GEIHNWTELLDLFNHTLSECHVELSQSTICRVVLF

ALYLAMFVVGLVENLLVICVNWRGSGRAGLMN 

LYILNMAIADLGIVLSLPVWMLEVTLDYTWLWG 

SFSCRFTHYFYFVNMYSSIFFLVCLSVDRYVTLTS 

ASPSWQRYQHRVREAMCAGIWVLSAIIPLPEVV 

HIQLVEGPEPMCLFMAPFETYSTWALAVALSTTI 

LGFLLPFPLITVFNVLTACRLRQPGQPKSRRHCLL 

LCAYVAVFVMCWLPYHVTLLLLTLHGTHISLHC 

HLVHLLYFFYDVIDCFSMLHCVINPILYNFLSPHF 

RGRLLNAVVHYLPKDQTKAGTCASSSSCSTQHSI 

IITKGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP

TQPLTPS 

3263 

A 

1 

919 

QARSPSVAAMASPQLCRALVSAQWVAEALRAP 

RAGQPLQLLDASWYLPKLGRDARREFEERHIPG 

AAFFDIDQCSDRTSPYDHMLPGAEHFAEYAGRL

GVGAATHVVIYDASDQGLYSAPRVWWMFRAFG 

HHAVSLLDGGLRHWLRQNLPLSSGKSQPAPAEF

RAQLDPAFIKTYEDIKENLESRRFQVVDSRATGR 

FRGTEPEPRDGIEPGHIPGTVNIPFTDFLSQEGLEK

SPEEIRHLFQEKKVDLSKPLVATCGSGVTACHVA 

LGAYLCGKPDVPIYDGSWVEWYMRARPEDVISE 

GRGKTH 

3264 

A 

1 

1398 

ARRSTPRTAPRASATRSAAGTMREIVHIQAGQCG 

NQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERI 

NVYYNEAAGNKYVPRAILVDLEPGTMDSVRSGP

FGQIFRPDNFVFGQSGAGNNWAKGHYTEGAELV 

DSVLDVVRKESESCDCLQGFQLTHSLGGGTGSG 

MGTLLISKIREEYPDRIMNTFSVMPSPKVSDTVVE 

PYNATLSVHQLVENTDETYSIDNEALYDICFRTL 

KLTTPTYGDLNHLVSATMSGVTTCLRFPGQLNA 

DLRKLAVNMVPFPRLHFFMPGFAPLTSRGSQQY 

RALTVPELTQQMFDSKNMMAACDPRHGRYLTV 

AAIFRGRMSMKEVDEQMLNVQNKNSSYFVEWIP 

NNVKTAVCDIPPRGLKMSATFIGNSTAIQELFKRI 

SEQFTAMFRRKAFLHWYTGEGMDEMEFTEAES 

NMNDLVSEYQQYQDATADEQGEFEEEEGEDEA 

3265

A 

A 

265 

862 

WWEDARVLGPFHPEEEGHWVMTPSEGARAGTG 
RELEMLDSLLALGGLVLLRDSVEWEGRSLLKAL 
VKKSALCGEQVHILGCEVSEEEFREGFDSDrNNR 
LVYHDFFRDPLNWSKTEEAFPGGPLGALRAMCK 
RTDPWVTIALDSLSWLLLRLPCTTLCQVLHAVS 
HQDSCPGETPPSLFPLIHLPLPRSVPLFLSTLE 

3266 

A 

2 

884 

AAGAGADGREPASERASRAEPPAVAMGQNDLM 

GTAEDFADQFLRVTKQYLPHVARLCLISTFLEDG 

IRMWFQWSEQRDYIDTTWNCGYLLASSFVFLNL 

LGQLTGCVLVLSRNFVQYACFGLFGIIALQTIAYS 

ILWDLKPLMRNLALGGGLLLLLAESRSEGKSMF 

AGVPTMRESSPKQYMQLGGRVLLVLMFMTLLH 

FDASFFSIVQNIVGTALMILVAIGFKTKLAALTLV

VWLFAINVYFNAFWTIPVYKPMHDFLKYDFFQT 

MSVIGGLLLVVALGPGGVSMDEKKKEW 

3267 

A 

802 

1011 

ASTFCSAWKRRSTAALWWSGSRASRSHPRELGP 

LCFVFGTAALSIRSMDVLSLFLEHGKLVFASGLSP
RA 

3268 

A 

490 

679 

EDAWITNPSLSNARSTPSKPLCYTVLKEGQVVGV 
KTTKASNTREKLRPESERRMVKSFGDEVT 

3269 

A 

2 

796 

GSTHASGARPSLKRARSQRGRPLPSRALPSAHKD 

MTTNAGPLHPYWPQHLRLDNFVPNDRPTWHILA 

GLFSVTGVLVVTTWLLSGRAAVVPLGTWRRLSL 

CWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQ 

LWKEYAKGDSRYILGDNFTVCMETITACLWGPL 

SLWVVIAFLRQHPLRFILQLVVSVGQIYGDVLYF 

LTEHRDGFQHGELGHPLYFWFYFVFMNALWLV 

LPGVLVLDAVKHLTHAQSTLDAKATKAKSKKN 

3270 

A 

17 

229 

GDTGPQILMSYLDSVASKLLQMVKKLSQSFCSNF 
KYLTKYS11KQVSDEIKKSRRTVESNPIFFKKNKKI 
Q 

3271 

A 

419 

553 

IQSGLSLCFADLSETPEGRAGVPGCPHSCDGVAS 
GRPCSPSSAG 

3272 

A 

1211 

1450 

FQFIQIELLNILQSLIRNQTQSPYNTTAYPAIDSVIT 
ILPFSFSCFFIITKCFGLSIFPSVIFFLHVYFILTLVVF 
YCC 

3273 

A 

59 

1562 

QAWSLQVALSPFFFPASPSNSFAAAVPQLLFPELP 

LPHVPGQESAKRRSARRFLIMSELTKELMELVW 

GTKSSPGLSDTIFCRWTQGFVFSESEGSALEQFEG 

GPCAVIAPVQAFLLKJCLLFSSEKSSWRDCSQEEQ 

KELLCHTLCDILESACCDHSGSYCLVSWLRGKTT 

EETASISGSPAESSCQVEHSSALAVEELGFERFHA 

LIQKRSFRSLPELKDAVLDQYSMWGNKFGVLLF 

LYSVLLTKGIENIKNEIEDASEPLIDPVYGHGSQS 

LINLLLTGHAVSNVWDGDRECSGMKLLGIHEQA 

AVGFLTLMEALRYCKVGSYLKISKIPYLDCLASE 

THLTVFFAKDMALVAPEAPSEQARRVFQTYDPE 

DNGFIPDSLLEDVMKALDLVSDPEYINLMKNKL 

DPEGLGIILLGPFLQEFFPDQGSSGPESFTVYHYN 

GLKQSNYNEKVMYVEGTAVVMGFEDPMLQTD 

DTPIKRCLQTKWPYIELLWTTDRSPSLN 

3274 

A 

186 

1358 

RVVHRFFKSSAFWPAEVKQPRGGPKTGSRKEGA 

GSRAPQPWRSFCGSVGAEGRMEKLRLLGLRYQ 

EYVTRHPAATAQLETAVRGFSYLLAGRFADSHE 

LSELVYSASNLLVLLNDGILRKELRKKLPVSLSQ 

QKLLTWLSVLECVEVFMEMGAAKVWGEVGRW 

LVIALIQLAKAVLRMLLLLWFKAGLQTSPPIVPL 

DRETQAQPPDGDHSPGNHEQSYVGKRSNRVVRT 

LQNTPSLHSRHWGAPQQREGRQQQHHEELSATP 

TPLGLQETIAEFLYIARPLLHLLSLGLWGQRSWK

PWLLAGVVDVTSLSLLSDRKGLTRRERRELRRR 

TILLLYYLLRSPFYDRFSEARILFLLQLLADHVPG 

VGLVTRPLMDYLPTWQKIYFYSWG 

3275 

A 

575 

759 

SVYSASSCKCCNYRKTEQIPDCEQPPASSMPERPS 
HESQPTPQMMPLSAPSRAEELGQRPG 

3276 

A 

7 

258 

KAAGHRLLLAAGHPSMPSSDCLLWEGSLELRPL 
QHISSLLVLVSTTCLFAFPRVPJAFESKSCLIYHCH 
CAFTVRHYMCSSHTG 

3277 

A 

9 

2221 

KLGVEPEEEGGGDDEEDAEAWAMELADVGAAA 

SSQGVHDQVLPTPNASSRVIVHVDLDCFYAQVE 

MISNPELKDKPLGVQQKYLVVTCNYEARKLGVK 


WO 01/57190 


PCT/US01/04098



KLMNVRDAKEKCPQLVLVNGEDLTRYREMSYK 

VTELLEEFSPVVERLGFDENFVDLTEMVEKRLQQ 

LQSDELSAVTVSGHVYNNQSINLLDVLHIRLLVG 

SQIAAEMREAMYNQLGLTGCAGVASNKLLAKL 

VSGVFKPNQQTVLLPESCQHLIHSLNHIKEIPGIG 

YKTAKCLEALGINSVRDLQTFSPKILEKELGISVA 

QRIQKLSFGEDNSPVILSGPPQSFSEEDSFKKCSSE 

VEAKNKIEELLASLLNRLCQDERKPHTVRLIIRRY 

SSEKHYGRESRQCPIPSHVIQKLGTGNYDVMTPM 

VDILMKLFRNMVNVKMPFHLTLLSVCFCNLKAL 

NTAKJCGLIDYYLMPSLSTTSRSGKHSFKMKDTH 

MEDFPKDKET^mDFLPSGRIESTRTRESPLDTTNF 

SKEKDINEFPLCSLPEGVDQEVFKQLPVDIQEEIL 

SGKSREKFOGKGSVSCPLHASRGVT SFFSKKOM 

QDIPINPRDHLSSSKQVSSVSPCEPGTSGFNSSSSS 

YMSSQKDYSYYLDNRLKDERISQGPKEPQGFHF 

TNSNPAVSAFHSFPNLQSEQLFSRNHTTDSHKQT 

VATDSHEGLTENREPDSVDEKITFPSDIDPQVFYE 

LPEAVQKELLAEWKRTGSDFHIGHK 

3278 

A 

1 

876 

GLRLHVDLVEKPRTGIMAAETRNVAGAEAPPPQ 
KRYYRQRAHSNPMADHTLRYPVKPEEMDWSEL 
YPEFFAPLTQNQSHDDPKDKKEKRAQAQVEFAD 

DRJRALRAAPAGGFQNIACLRSNAMKHLPNFFY 

KGQLTKMFFLFPDPHFK^TKHKWRIISPTLLAEY 

AYVLRVGGLVYTITDVLELHDWMCTHFEEHPLF 

ERVPLEDLSEDPVVGHLGTSTEEGKKVLRNGGK 

NFPAIFRRJQDPVLQAVTSQTSLPGH 

3279 

A 

82 

2929 

trtkrrlgrekamaspprgwgcgelllpfmllg 

tlcepgsgqirysmpeeldkgsfvgniakdlgle 

pqelaergvrivsrgrtqlfalnprsgslvtagri 

dreelcaqsplcvvnmilvenkmkiygveveii 

dindnfprfrdeelkvkvnenaaagtrlvlpfa 

rdadvgvnslrsyqlssnlhfsldvvsgtdgqk 

ypelvleqpldreketvhdllltaldggdpvlsg 

tthirvtvldandnaplftpseysvsvpenipvgt 

rllmltatdpdegingkltysfrneeekisetfql 

dsnlgeistlqsldyeesrfylmewaqdggal 

vasakvwtvqdvndnapeviltsltssisedcl

pgtvialfsvhdgdsgengeiacsiprnlpfklek 

svdnyyhllttrdldreetsdynitltvmdhgt 

pplsteshiplkvadvndnppnfpqasystsvten 

nprgvsifsvtahdpdsgdnarvtyslaedtfqg 

aplssyvshmsdtgvlyalrsfdyeqlrdlqlwv 

tasdsgnpplssnvslslfvldqndntpeilypal 

ptdgstgvelaprsaepgylvtkvvavdkdsgq 

nawlsyrllkasepglfavglhtgevrtarall 

drdalkqslvvavedhgqpplsatftvtvavad 

ripdiladlgsiktpidpedldltlylvvavaavs 

CVFLAFVIVLLVLRLRRWHKSRLLQAEGSRLAG 

VPASHFVGVDGVRAFLQTYSHEVSLTADSRKSH 

LIFPQPNYADTLLSEESCEKSEPLLMSDKVDANK 

EERRVQQAPPNTDWRFSQAQRPGTSGSQNGDDT 

GTWPNNQFDTEMLQAMILASASEAADGSSTLGG 

GAGTMGLSARYGPQFTLQHVLQGELGSDYRQN 

VYIPGSNATLTNAAGKRDGKAPAGGNGNKKXS 
GKKEKK 

3280 

A 

149 

1288 

GTSQMSSHKGSVVAQGNGAPASNREADTAELAE 

LGPLLEEKGKRVIANPPKAEEEQTCPVPQEEEEE 

VRVLTLPLQAHHAMEKMEEFVYKVWEGRWRVI

PYDVLPDWLKDNDYLLHGHRPPMPSFRACFKSIF 

RIHTETGNIWTHLLGFVLFLFLGILTMLRPNMYF 

MAPLQEKVVFGMFFLGAVLCLSFSWLFHTVYCH 

VQRTFQk r 7 nV<iriTAr J nV/fnQPTVPWT vvQrvrc 
oCJV V OIV 1 r Or^L,U I oOI/\JLL(HVlVJor VrWLi i or I L^o 

PQPRLIYLSIVCVLGISAIIVAQWDRFATPKHRQT 
RAGVFLGLGLSGWPTMHFTIAEGFVKATTVGQ 
MGWFFLMAVMYITGAGLYAARIPERFFPGKFDI
WFQSHQIFHVLVVAAAFVHFYGVSNLQEFRYGL 
EGGCTDDTLL 

3281

A 

l 

JD / 

KrKlUvl^roroCKVLVlJbJ^ 1 NoMNyblvLA 

KLQAQVRIGGKGTARRKKKVVHRTATADDKKL 

QSSLKKLAVNNIAGIEEVNMIKDDGTVIHFNNPK 

VQASLSANTFAITGHAEAKPITEMLPGILSQLGAD 

SLTSLRKLAEQFPRQVLDSKAPKPEDIDEEDDDV 

PDLVENFDEASKNEAN 

3282 

A 

155 

1139 

HALGRRGGSQELSAAACGCFALRLRAPGSGRPA 

LAPGAAAFAGLGGAPRFPPRGSAAGRTMLLKEY

RICMPLTVDEYKIGQLYMISKHSHEQSDRGEGVE

WQNEPFEDPHHGNGQFTEKRVYLNSKLPSWAR 

AV VrJvlr i V liilvAWJN YYrYlllfci 1 CorLrlvrolrl 

IETKYEDNKGSNDTIFDNEAKDVEREVCFIDIACD

EIPERYYKESEDPKHFKSEKTGRGQLREGWRDSH

QPIMCSYKLVTVKFEVWGLQTRVEQFVHKVVR 

i^iivi^iwni\A^/\r/\ w v w i jljivi i wiuu v ivc. i jc.iv.Ln 

MHEQTNIKVCNQHSSPVDDIESHAQTST

3283 

A 

159 

547 

IKSKLNQQVEVQESEWRLTEAKGPTMGKESGW 

DSGRAAVAAVVGGVVAVGTVLVALSAMGFTSV 

GIAASSIAAKMMSTAAIANGGGVAAGSLVAILQS 

V \Jr\J\\JL*0 V 1 orv V lOvJr/Vvj 1 r\lAJ/\ W Luorroj 

3284 

A 

227 

637 

TSNSLLRPDRMSVMDLANTCSSFQSDLDFCSDCG 

SVLPLPGAQDTVTCIRCGFNINVRDFEGKVVKTS 

WFHQLGTAMPMSVEEGPECQGPVVDRRCPRCG 

HEG3VIAYHTRQMRSADEGQTVFYTCTNCKFQEK 

EDS 

3285 

A 

123 

1535 

* 

HRLSYDEAFAMANDPLEGFHEVNLASPTSPDLL 

GVYESGTQEQTTSPSVIYRPHPSALSSVPIQANAL

DVSELPTQPVYSSPRRLNCAEISSISFHVTDPAPCS 

TSGVTAGLTKLTTRKDNYNAEREFLQGATITEAC 

DGSDDIFGLSTDSLSRLRSPSVLEVREKGYERLKE 

ELAKAQRELKLKDEECERLSKVRDQLGQELEEL 

TASLFEEAHKMVREANIKQATAEKQLKEAQGKI 

DVLQAEVAALKTLVLSSSPTSPTQEPLPGGKTPF 

iSJVvjxl 1 KIN JVo 1 ooAMoLJorlv^JJJ-/0 V A^rl VJSJJL/RilA 

DLSLYNEFRLWKDEPTMDRTCPFLDKIYQEDIFP 
CLTFSKSELASAVLEAVENNTLSIEPVGLOPTRFV 
KASAVECGGPKXCALTGQSKSCKHRIKLGDSSN 
YYYISPFCRYRITSVCNFFTYIRYIQQGLVKQQDV
DQMFWEVMQLRKEMSLAKLGYFKEEL 

3286 

A 

3 

589 

GPSQSMAAGELEGGKPLSGLLNALAQDTFHGYP 
GITEELLRSQLYPEVPPEEFRPFLAKMRGILKSIAS

ADMDFNQLEAFLTAQTKKQGGITSDQAAVISKF 
WKSHKTKIRESLMNQSRWNSGLRGLSWRVDGK 
^O^PH^AOTHTPVATIFI FT CtKYGOFSFFT PT FFD 
EVKVNQILKTLSEVEESISTLISQPN 

3287 

A 

50 

390 

LGAMAKHHPDLIFCRKQAGVAIGRLCEKCDGKC 
VICDSYVRPCTLVRICDECNYGSYQGRCVICGGP
GVSDAYYCKECTIQEKDRDGCPKIVNLGSSKTDL 
FYFRKKYGFKKR 

3288 

A 

3 

428 

RTTFFRFRPCESLCGDMKLLTHNLLSSHVRGVGS 

RGFPLRLQATEVRICPVEFNPNFVARMIPKVEWS 

AFLEAADNLRLIQVPKGPVEGYEENEEFLRTMH 

HLLLEVEVIEGTLQCPESGRMFPISRGIPNMLLSE 

EETES 

3289 

A 

1 

1743 

AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTI 

HGSPREDTGTPRSREMMFQDSVAFEDVAVSFTQ 

EEWALLDPSQKNLYRDVMQETFKNLTSVGKTW 

KVQNIEDEYKNPPIRNLSLMREKLCESKESHHCG 

ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGH 

SSLNTHIRADTGHKSSEYQEYGENPYRNKECKK 

AFSYLDSFQSHDKACTKEKPYDGKECTETFISHS

CIQRHRVMHSGDGPYKCKFCGKAFYFLNLCLIH 

ERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTG 

VNADECKECGNAFSFPSEIRRHKRSHTGEKPYEC 

KQCGKVFISFSSIQYHKMTHTGEKPYECKQCGK 

AFRCGSHLQKHGRTHTGEKPYECRQCGKAFRCT 

olJlA^r\xl£}ls. 1 rl 1 EtUlsJr i OOJvl^L,OlvOrKL,Aovljv^l 

HERTHSGEKPHECKECGKVFKYFSSLRIHERTHT 

GEKPHECKQCGKAFRYFSSLHIHERTHTGDKPYE 

CKVCGKAFTCSSSIRYHERTHTGEKPYECKHCGK 

AFISNYIRYHERTHTGEKPYQCKQCGKAFIRASS 

CREHERTHTINR 

3290 

A 

2 

1350 

GRPRSSSDIN^NFLRERAGLSSAAVQTRIGNSAAS 

RRSPAARPPVPAPPALPRGRPGTEGSTSLSAPAVL 

WAVAVVVWVSAVAWAMANYIHVPPGSPEVP 

KLNVTVQDQEEHRCREGALSLLQHLRPHWDPQE 

VTLQLFTDGITNKLIGCYVGNTMEDVVLVRIYGN 

KTELLVDRDEEVKSFRVLQAHGCAPQLYCTFNN 

GLCYEFIQGEALDPKHVCNPAIFRLIARQLAKIHA

IHAHNGWIPKSNLWLKMGKYFSLIPTGFADEDIN 

NDLLCKNIIYNEKQGDVQFIDYEYSGYNYLAYDI 

GNHFNEFAGVSDVDYSLYPDRELQSQWLRAYLE 

AYKEFKGFGTEVTEKEVEILFIQYNQFALASHFF

WGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMK 

PEVTALKVPE 

3291 

A 

102 

839 

PEAQTSAVLAREKGHLPTiMRHEAPMQMASAQD 

ARYGQKDSSDQNFDYMFKLLIIGNSSVGKTSFLF

iv i fxL/LJor i o/vr vol v ULUrivV fvi v rjviN iiivrvi jvla^i 

WDTAGQERYRTITTAYYRGAMGFILMYDITNEE 

SFNAVQDWSTQIKTYSWDNAQVILVGNKCDME

DERVISTERGQHLGEQLGFEFFETSAKDNINVKQ 

TFERLVDIICDKMSESLETDPAITAAKQNTRLKET 

PPPPQPNCAC 

3292 

A 

2 

4136 

DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTSLQ 
QRTPAEMSPVIJriFYVRPSGHEGAASGHTRRKLQ 

GKLPELQGVETELCYNVNWTAEALPSAEETKKL 

MWLFGCPLLLDDVARESWLLPGSNDLLLEVGPR 

LNFSTPTSTNIVSVCRATGLGPVDRVETTRRYRLS 

FAHPPSAEVEAIALATLHDRMTEQHFPHPIQSFSP 

ESMPEPLNGPINILGEGRLALEKANQELGLALDS 

WDLDFYTKRFQELQRNPSTVEAFDLAQSNSEHS 

RHWFFKGQLHVDGQKLVHSLFESIMSTQESSNP 

NNVLKFCDNSSAIQGKEVRFLRPEDPTRPSRFQQ 

QQGLRHVVPTAETHNFPTGVCPFSGATTGTGGRI 

RDVQCTGRGAHVVAGTAGYCFGNLHPGYNLP 

WEDLSFQYPGNFARPLEVAIEASNGASDYGNKF

GEPVLAGFARSLGLQLPDGQRREWIKPIMFSGGI 

GSMEADHISKEAPEPGMEWKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQKM 

NRVIRACVEAPKGNPICSLHDQGAGGNGNVLKE 

LSDPAGAIIYTSRFQLGDPTLNALEIWGAEYQESN 

ALLLRSPNRDFLTHVSARERCPACFVGTITGDRRI 

VLVDDRECPVRRNGQGDAPPTPPPTPVDLELEW 

VLGKMPRKEFFLQRKPPMLQPLALPPGLSVHQA 

LERVLRLPAVASKRYLTNKVDRSVGGLVAQQQC

VGPLQTPLADVAVVALSHEELIGAATALGEQPV 

KSLLDPKVAARLAVAEALTNLVFALVTDLRDVK 

CSGNWMWAAKLPGEGAALADACEAMVAVMA 

ALGVAVDGGKDSLSMAARVGTETVRAPGSLVIS 

AYAVCPDITATVTPDLKHPEGRGHLLYVALSPG 

QHRLGGTALAQCFSQLGEHPPDLDLPENLVRAFS 

ITQGLLKDRLLCSGHDVSDGGLVTCLLEMAFAG 

NCGLQVDVPVPRVDVLSVLFAEEPGLVLEVQEP 

DLAQVLKRYRDAGLHCLELGHTGEAGPHAMVR 

VSVNGAVVLEEPVGELRALWEETSFQLDRLQAE 

PRCVAEEERGLRERMGPSYCLPPTFPKASVPREP

GGPSPRVAILREEGSNGDREMADAFHLAGFEVW

DVTMQDLCSGAIGLDTFRGVAFVGGFSYADVLG 

SAKGWAAAVTFHPRAGAELRRFRKRPDTFSLGV 

r^XT/^rr^T t at t wrv/^or^DXTnr^ a AtJ\jrrDncr»DAi? 
LNUt^LLALLu W V UOl^rNcJJAAiiMOrUov^r AK 

PGLLLRHNLSGRYESRWASVRVGPGPALMLRG 

MEGAVLPVWSAHGEGYVAFSSPELQAQIEAJR.GL 

APLHWADDDGNPTEQYPLNPNGSPGGVAGICSC 

DGRHLAVMPHPERAVRPWQWAWRPPPFDTLTT 

SPWLQLFINARNWTLEGSC 

3293 


UJ 


nVRHFWAnTMAQP ArjPR A Ar^TnnQr»POT-TPPRV 
VJ v rvvjr w fAKJ I lviyA o tv7\0 1 ivrv/\ O l J-yvjol-vr V^JrlJtvJciJV V 

AMHYQMSVTLKYEIKXLIWHLVIWLLLVAKMS 

VGHLRLLSHDQVAMPYQWEYPYLLSILPSLLGLL 

SFPRNNISYLVLSMISMGLFSIAPLIYGSMEMFPA 

AQQLYRHGKAYRFLFGFSAVSIMYLVLVLAVQV 

HAWQLYYSKKLLDSWFTSTQEKKHK 

3294 

A 

35 

1821 

SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRA 
WSAGGPALGLMAAPVRLGRKRPLPACPNPLFVR 
WLTEWRDEATRSRHRTRFVFQKALRSLRRYPLP 

DHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQP 

KAGGSGSYWPARHSGARVBLLVLYREHLNPNGH 

HFLTKEELLQRCAQKSPRVAPGSARPWPALRSLL 

HRNLVLRTHQPARYSLTPEGLELAQKLAESEGLS 

LLNVGIGPKEPPGEETAVPGAASAELASEAGVQQ 

QPLELRPGEYRVLLCVDIGETRGGGHRPELLREL 

QRLHVTHTVRKLHVGDFVWVAQETNPRDPANP 

GELVLDHIVERKRLDDLCSSIIDGRFREQKFRLKR

CGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQ 

VIDGFFVKRTADIKESAAYLALLTRGLQRLYQGH

TLRSRPWGTPGNPESGAMTSPNPLCSLLTFSDFN 

AGAIKNKAQSVREVFARQLMQVRGVSGEKAAA 

LVDRYSTPASLLAAYDACATPKEQETLLSTIKCG 

RLQRNLGPALSRTLSQLYCSYGPLT 

3295 

A 

2 

1115 

EFHPHTQVSGLLTPQLQEPDVWSPSRGQPVSLHL 

PGKGAPEVKEMAWWKSWIEQEGVTVKSSSHFN

PDPDAETLYKAMKGIGTNEQAIIDVLTKRSNTQR 

QQIAKSFKAQFGKDLTETLKSELSGKFERLIVAL

MYPPYRYEAKELHDAMKGLGTKEGVIIEILASRT 

KNQLREIMKAYEEDYGSSLEEDIQADTSGYLERI 

LVCLLQGSRDDVSSFVDPALALQDAQDLYAAGE 

KIRGTDEMKFITILCTRSATHLLRVFEEYEKIANK 

SIEDSIKSETHGSLEEAMLTVVKCTQNLHSYFAE 

RLYYAMKGAGTRDGTLIRNIVSRSEIDLNLIKCH

FKKMYGKTLSSMIMEDTSGDYKNALLSLVGSDP 

3296 

A 

1 

838 

GTRGGVGPGDNGGVEAGAKPGAAAIPLRGDGS 

GETGPGRVAPGEVRGSPRGHVAGPEGPREVLFFF 

FLPSSKPASEVINEYSWKVDFLKGMLQAEKLTSS 

SEKALANQFLAPGRVPTTARERVPATKTVHLQS 

RARYTSEMRSELLGTDSAEPEMDVRKRTGVAGS 

QPVSEKQSAAELDLVLQRHQNLQEKLAEEMLGL 

ARSLKTNTLAAQSVIKXDNQTLSHSLKMADQNL 

EKLKTESERLEQHTQKSVNWLLWAMLIIVCFIFIS 

MILFIRIMPKLK 

3297 

A 

46 

617 

HKQPAGFLGLWLGTETYTISFPGPETFGLGLSHA 

TGIPGSPACRQPVVGLHSLHNYRMAMVSAMSW 

VLYLWISACAMLLCHGSLQHTFQQHHLHRPEGG 

TCEVIAAHRCCNKNRIEERSQTVKCSCLPGKVAG 

TTRNRPSCVDASIVIGKWWCEMEPCLEGEECKTL 

PDNSGWMCATGNKIKTTRIHPRT 

3298 

A 

157 

748 

IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLAD 

PLNKSSYKYEADTVDLNWCVISDMEVIELNKCT

SGQSFEVILKPPSFDGVPEFNASLPRRRDPSLEEIQ 

KKLEAAEERRXYQEAELLKHLAEKREHEREVIQ 

KAIEENNNFIKMAKEKLAQKMESNKENREAHLA 

AMLERLQEKDKHAEEVRKNKELKEEASR 

3299 

A 

5 

892 

TQLPAPLSGVLSRLQLGSGAPLLTWVQETAGVA 

GGAPRRRTPVTMWRLLARASAPLLRVPLSDSWA 

LLPASAGVKTLLPVPSFEDVSIPEKPKLRFIERAPL 

VPKVRREPKNLSDIRGPSTEATEFTEGNFAILALG 

GGYLHWGHFEMMRLTINRSMDPKNMFAIWRVP 

APFKPITRKSVGHRMGGGKGAIDHYVTPVKAGR 

LWEMGGRCEFEEVQGFLDQVAHKLPFAAKAVS 

RGTLEKMRKDQEERERNNQNPWTFERIATANML 

GIRKVLSPYDLTHKGKYWGKFYMPKRV 

3300 

A 

2 

1847 

FVAGGPRGSGSAAETMPEIRVTPLGAGQDVGRS 
CILVSIAGKNVMLDCGMHMGFNDDRRFPDFSYI 
TQNGRLTDFLDCVIISHFHLDHCGALPYFSEMVG 
YDGPIYMTHPTQAICPILLEDYRKXAVDKKGEAN 
FrTSQMIKDCMKXVVAVHLHQTVQVDDELEIKA 



YYAGHVLGAAMFQIKVGSESVVYTGDYNMTPD 

RHLGAAWIDKCRPNLLITESTYATTIRDSKRCRE 

RDFLKKVHETVERGGKVLIPVFALGRAQELCILL 

ETFWERMNLKVPIYFSTGLTEKANHYYKLFIPWT 

NQIGRKTFVQRNMFEFKHIKAFDRAFADNPGPM

VVFATPGMLHAGQSLQIFRKWAGNEKNMVIMP 

GYCVQGTVGHKILSGQRKLEMEGRQVLEVKMQ 

VEYMSFSAHADAKGIMQLVGQAEPESVLLVHGE 

AKKMEFLKQKIEQELRVNCYMPANGETVTLPTS 

PSIPVGISLGLLKREMAQGLLPEAKKPRLLHGTLI

MKDSNFEU.VSSEQALKELGLAEHQLRFTCRVHL 

HDTRKEQETALRVYSHLKSVLKDHCVQHLPDGS 

VTVESVLLQAAAPSEDPGTKVLLVSWTYQDEEL 

GSFLTSLLKKGLPQAPS

3301 

A 

2 

349 

CIRTEPAAAFRRLGALSGAAALGFASYGAHGAQ
FPDAYGKELFDKANKHHFLHSLALLGVPHCRKP 
LWAGLLLASGTTLFCTSFYYQALSGDPSIQTLAP 
AGGTLLLLGWLALAL 

3302 

A 

59 

1184 

LRRNCSALGGLFQTIISDMKGSYPVWEDFINKAG

KLQSQLRTTVVAAAAFLDAFQKVADMATNTRG 

GTREIGSALTRMCMRHRSIEAKLRQFSSALIDCLI 

NPLQEQMEEWKKVANQLDKDHAKEYKJCARQEI 

KKKSSDTLKLQKKAKKGRGDIQPQLDSALQDVN 

DKYLLLEETEKQAVRKALIEERGRFCTFISMLRP 

VIEEEISMLGEITHLQTISEDLKSLTMDPHKLPSSS 

EQVILDLKGSDYSWSYQTPPSSPSTTMSRKSSVC 

SSLNSVNSSDSRSSGSHSHSPSSHYRYRSSNLAQQ 

APVRLSSVSSHDSGFISQDAFQSKSPSPMPPEAPN 

QRRKEKREPDPNGGGPTTASGPPAAAEEAQRPRS 

M 

3303 

A 

511 

958 

AGRGGPGKPVSWSSGPGSPGQTQRRSWVKSTRG 
HSSLLPPSQDFVAGLSVILRGTVDDRLNWAFNLY 
DLNKDGCITKEEMLDIMKSIYDMMGKYTYPALR 
EEAPREHVESFFQKMDRNKDGVVTIEEFIESCQK 
DENIMRSMQLFDNVI 

3304 

A

40 

432 

ISEAASGAFQAR*FYQM\LEQKTDALGKQSVNRG 
FTKDKTLSSIFNIEMVKEKTAEEIKQIWQQYFAA 
KDTVYAVIPAEKFDLIWNRAQSCPTFLCALPRRE 
GYEFFVGQWTGTELHFHCTYKYSDPEGKA

3305 

A 

2 

483 

LDACSTGPYSRSTHASADAWADAWVVVVLKVV 
GMTLFLLYFPQIFNKSNDGF 111 RS YGTVSQIFG S 
RSPSPNGFITTRSYGTVCPKDWEFYQARCFFLIHL 
*\SSWNESWDFCKGKGCTLAIVDNSETLKLLHDL 
rlDAEKNTYIALPYRSSKYMSTCNGTF 

3306 

A

2 

872 

TLSSACLIGDAWKELTIVAGAVSNQLLVWYPAT 
ALADNKPVAPDRRISGHVGIIFSMSYLESKGLLA 
TASEDRSVRJWKGGDLRVPGGRVQN1GHCFGHS 
ARVWQVKLLENYLISAGEDCVCLVWSHEGEILQ 
AFRGHQGRGIRAIAAriERQAWVITGGDDSGIRL 

XT7TTT A 7 /~*T> PVTD^T /""» fT\J PCI I /^\\/r>** A r> WTVAP/TXC 

WHLVOKGYRGLO/DLObLLQVr^^ARY 1 QUCJDo 
GWLLATAGSD*YRGPVSL*RRGQVLGAAARG*T 
FPVLLPAGGSSWSRGLRIVCYGQWGRSCQGCPH
QHSNCCCGPDPVSWEGAQLELGPAWL 

3307 

A 

2 

927 

RTSRVEKGLRKAGAAVTMESDEWFSQALPANTS 
AQKAELIALTQAIRWGKDINVNTDSRYAFATVH



VRGAICQERRLLTSAEKAIKNKNPPSSKPNRSSSXF 
WGTTCDQVNAKQGPKPSPGHRLRRNLPGEKWEI 

lYFTKVKPHnAOYKYT T VT VOTF^fiWTP A F A TV 
xjx. i iv v ivrnynVj i is. i i^lj v l* v xj x r ovj w l E*j\rf\ I IV 

NETVNMVVKFLLNEIIPRHGLPVAIGSDNGPAFA 

LSIV*SVSKALNIQWKLHCAYRPQSSGQVERMNC

TLKNTLTKLILETGVNWVSLLPLALLRVRCTPYW 

AGFLPFEIMYGRVLPILPKLRDAQLAKISQTNLLQ 

YLQSP 

3308 

A 

490 

1077 

DKMniMFniPTFT ^n^^nTHDPrrPvnAWP 
in oiOLur in L/iNnLyix i CjL^oijoou i xtixJcjKji^ v v^/\r i H, 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 

3309 

A 


1077 

in or oLjUF in u in EtUir l JC/JL»oJL/ooU l riJL/riLjJLi V v^Ar it 

DLSGRQYVNEVFNFSVDKLYDLLFTNSPFQRDF 

MEQRRFSDIIFHPWKKEENGNQSRVIPYTITLTNP 

LEHKTATVRETQTMYKASQESECYVIDAEVLTH 

DVPYHDYFYTINRYTLTRVARNKSRLRVSTELRY 

RKQPWGLVKTFIEKNFWSGLEDYFRHL 

3310 

A 

2 

1198 

SPLCHPGLSRER/S* SEAKLRSGRYC*KRQVEAPL 

*RPGL*TMAASDTERDGLAPEKTSPDRDKKKEQS 

EVSVSPRASKJHHYSRSRSRSRERKRKSDNEGRJCH 

RSRSRSKEGRRHESKDKSSKKHKSEEHNDKEHSS 

DKGRERLNSSENGEDRHKRKERKSSRGRSHSRS 

RSRERRHRSRSRERKKSRSRSRERKKSRSRSRER 

fcfl^QP QDCRPT? IfPPTPCPCP CDCDUPUDTD CDCDTD 

SRSRDRKKRIEKPRRFSRSLSRTPSPPPFRGRNTA 

MDAOFAT ARPT FPAKK1 nFOPFKFA/fVFK'nkrnn 

EIAAAAAATGGSVLNVAALLASGTQVTPQIAMA 
AQMAALQAKALAETGIAVPSYYNPAAVNPMKF 
AEQEKKRKMLWQGKKEGDKSQSAGNMGKN 

3311 

A 

177 

4 

PIQIPPRITPPRPSPHLLTPRTGSSPPPPRAPSPPHPT 
POPAHDFPPT <JAVF ^HHTI^T 

3312 

A 

3 

426 

LESPRH*PPCWGPLIWALTVSSVPSPTPELSCILKS 

P/RPAPPV/POI T ^PAPPO^^riPT T OT QPrPH 

AGQWPSPLSPAPPPSSDPLSGLSPCPGAGPRSSP\S 
ASAPCRAVPLSPRRLTWPPHLOVGrLrPTGRPWK 
NL 

3313 

A 

162 

2 

QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE 
IAP\CTPAWVTQRDFFRKKK 

3314 

A 

162 

2 

QLQNLASRGCL*SQLLRRLRRENRLNPGGGGCSE
IAPVCTPAWVTQRDFFRKKK

3315 

A 

466 

1 

PRKRESWWGERLP/PRGFPPAAEDAPAPGWKGR 
KHASRTARAHVFHPIRQSIRSPVRGRPGDPRAAH 
TRSAGTRLQCKASRGG*GKGPAPTR*EGGPGSAP 
APLPASSGCSLFPDSSPWTPPPPAPGAAAAQP**T 
PRCPAALRAGAHIGRVGRPY 

3316 

A 

3 

2307 

NHLGTLMQNWDSSSRVPFSSGQHSTQSFPPSLMS 

KSNSMLOKPTyAYVI^MDGOESMEPKLSSEHYSS 

QSHGNSMTELKPSSKAHLTKLKIPSQPLDASASG 

DVSCVDEILKEMTHSWPPPLTAIHTPCKTEPSKFP 

FPTKESQQSNFGTGEQKRYNPSKTSNGHQSKSM 

LKDDLKLSSSEDSDGEQDCDKTMPRSTPGSNSEP 

SHHNSEGADNSRDDSSSHSGSESSSGSDSESESSS 

SDSEANEPSQSASPEPEPPPTNKWQLDNWLNKV 

NPHKVSPASSVDSNIPSSQGYKKEGREQGTGNSY 

TDTSGPKETSSATPGRXAPKPIQKGSESGRGRQKS 

PAQSDSTTQRRTVGKKQPKKAEKAAAEEPRGGL 

KIESETPVDLASSMPSSRHKAATKGSRKPNIKKES 

KSSPRPTAEKKKYKSTSKSSQKSREIIETDTSSSDS 

DESESLPPSSQTPKYPESNRTPVKPSSVEEEDSFFR 

QRMFSPMEEKJELLSPLSEPDDRYPLIVKIDLNLLT 

RIPGKPYKETEPPKGEKKNVPEKHTREAQKQASE 

KVSNKGKRKHKNEDDNRASESKKPKTEDKNSA 

KTEHGSRKRTISQSSSLKSSSNSNKETSGSSKNSS 
STSKQKKTEGKTSSSSKEVKVKAPSSSSNCPPSAP 
TLDSSKPRRTKLVFDDRNYSADHYLQEAKKLKH 
NADALSDRFEKAVYYLDAVVSFIECGNALEKNA 

nP^foT^PFPMVQPTVriT T 

3317 

A 

496 

2 

NLLQDEKLVHSYPYDWRTQETCGY1VPARQWFI 
N\TRDIKTA AKELLKKVKFIPG S ALNGMVEMMD 
RRPYWCISRQRVWGVPIPVFHHKTKDEYLINSQT 
TEHIVKLVEQHGSDrWWTLPPEQLLPKEVLSEVG 
GPDALEYVPGQDILDIWFDSGTSWSYVLPGPD 

3318

A 

A 

Z 

J lz 

AWrlbODbKSDQCHHPYNYGFDYYYGMPFTLVD 
SCWPDPSRNTELAFESQLWLCVQLVAIAILTLTF 
GKLSGWVSVPWLLIFSMILFIFLLGYAWFSSHTSP
LYWDCLLMRGHEITEQPMKAEXRAGSIMVKEAIF 

AP 

3319 

A 

407 

1 

SSLHRSPRPASPLPVPEAP\SFLPVPAPKPSALPPFS 
LSGAPSSASTFSPHSSPSPASPTPAPSPQSPFPSRPT 
SPPSLTPTRRPPLPADRRGPHLLYQPLHAPLEAAA 
TGPE/PSAAAGRLPRPRPPWRAAYPASR 


3320

A



0\/fQP A V A PVlV/fT nVDPnTA^U/VIPD trr^XT/^A/CA/C 
^IVlbilA V AxirvMLy Y KJvU 1 Avj W IVlUKbLjJNO Vovb 

WRPSVEFPGNLYRGEGIVYGTLEEVWDCVKPAV 
GGLRVKWDENVTGFEIIQSITDTLCVSRTSTPSAA 
MKLISPRDFVDLVLVKRYEDGTISSNATHVEHPL 

PPPkTPnPVPnPMT-TPr'rjPPPPPT PrjPPTfcTTKFT vtpp 

HTDLSGYLPQNWDSFFPRSMTRFYANLQKAVK 

3321 

A 

37 

360 

SHSASGAGRPAAPAADLRPAPNGQRPGPRLGAR 
ALWLPPRGRPDEAGRLPGEHLPQVPWDPGLTRS 
PSPRGPCRGAARAGHVGETPAPWGCPPPCAWEH 

3322 

A 

1 

420 

AIVEDKHSGRSYDITSDLGNVLTSTSIAKTVNG*A

ESSDSGAESDEEDAQEDLMGAYHSDIDKKMMKI 

VADHKNLEVIVTNGYDKDGFVHDIQNDIHASSSL 

NGRSTVHVK^roENLGQTGKSAVCfflQDINDDH 

vpnvr 

VHxJ V 1 

3323 

A 

8 

459 

DTLSLNCTLPETLPMTPSF*LSFL*FPGLARAKSIP 
TKTYSNEVVTLWYRPPDILLGSTDYSTQIDMW*G 
QVEVWQGPCGKGGGLVTTATQPAAFLFTVPSLP
RG VGCIF YEM ATGR PLFPGSTVEEOI HFIFR ILSF 

XV VJ V vJ v_ - 1 1 X X-*lVLr\ 1 VJ IVI X_*X X VJO X V JL/Ajs^JUx JUL XJL XVJ 1 (OX-/ 

EAWALCAVETHR 

3324 

A 

1276 

466 

PGSTHASARITIY*L*IILSNATEVDNNFSKPPPFFP 
AGAPPASSSSSSSSSSPPTVSTAPPLBPPPGFPPPPG 
APPPSLIPTIESGHSSGYDSRSARAFPYGNVAFPH 
LPGSAPSWPSLVDTSKQWDYYARSSSSSSSSSSSS 

SSSPRDRDRER*RTRERERERDHSPTPSVFNSDEE 
RYRYREYAERGYERHRASREKEERHRERRHREK 
EETRHKSSRSNSRRRHESEEGDSHRRHKHKKSKR 
SKEGKEAGSEPAPEQESTEATPAE 

3325 

A 

266 

3312 

TCLFSASCSSLPSPSSSFALLSTENTQRTYRVNPD 

GSLRVTFASGN4EIGLSSEPHILAGAVNPTLGKCNI 

SLPGEHNANLISVL**GEQGCA*NVFHISFS*AHN 

RNLLSIDFDHITRTGKIYDDHRKFTLRILYDQTGR

PILWSPVSRYNEVNITYSPSGLVTFIQRGTWNEK 

MEYDQSFL*SPQL*LSIICYSAFVSFQSVMLLLHS 

QRRYIFEYDQPDCLLSVTMPSMVRHSLQTMLSV 

GYYRNIYTPPDSSTSFIQDYSRDGRLLQTLHLGTG 

RRVLYKYTKQARLSEVLYDTTQVTLTYEESSGD \ 

LSDSSTLIA*LLTVFVLVPAGPLIGRQIFRFSEEGL 

VNARFDYSYNNFRVTSMQAVINETPLPIDLYRYV 

DVSGRTEQFGKFSVINYDLNQVITTTVMKHTKIF 

SANGQVIEVQYEILKAIAYWMTIQYDNVGRMVI 

CDIRVGVDANITRYFYEYDADGQLQTVSVNDKT 

QWRYSYDLNGNINLLSHGKSARLTPLRYDLRDRI 

TRLGEIQYKMDEDGFLRQRGNDIFEYNSNGLLQ 

KAYNKASGWTVQYYYDGLGRRVASKSSLGQHL 

QFFYADLTNPIRVTHLYNHTSSEITSLYYDLQGH 

LIAMELSSGEEYYVACDNTGTPLAVFSSRGQVIK 

EILYTPYGDIYHDTYPDFQVIIGFHGGLYDFLTKL

VHLGQRDYDVVAGRWTTPNHHIWKQLNLLPKP 

FNLSTKLIKYGIFHFLFLILCLTDIRSWLELFGFQL 

HNVLPGFPKPELENSPSPQMSNSMLHLLCASLS* 

TILGIQCELQKQLRNFISLDQLPMTPRYNDGRCLE 

GGKQPRrAAVPbVruKGlKJ^AirLUulV 1 AJDlluVA 

NEDSRRLAAILNNAHYLENLHFTIEGRDTHYFIK

LGSLEEDLVLIGNTGGRRILENGVNVTVSQMTSV 

LNGRTRRFADIQLQHGALCFNIRYGTTVEEEKNH 

VLEIARQRAVAQAWTKEQRRLQEGEEGIRAWTE 

GEKQQLLSTGRVQGYDGYFVLSVEQ 

3326 

A 

290 

1041 

KACLHLLSSFLTSNFLFNPLLPDSLYSVEARSQRA 
NLGPCRRKRLQTLMRLAAGFQYSSHKDPSLSAK 

r?iy'Tj ,r rT^\/i_rxTt? a t> / r ^'D\l/T>/^ , \i/\//^#DT'AT^f^Q/"" , fTR'/TDri 
bKrl 1 U Y rlN bAKur W ru W V kj * K A AJLXjoUUKvjr JL> 

GAHHPGPKSSSWRASRLLPGLGGSHHLDAYVGR 
DLECGTPAPLQLEIPPQPRGHPAPIPTGQAGPRDS
GPGASP*VETRPLTDGRR*PGVRPVGWTPAHPAG 
TLRPRGAVEPSVSACGKWAPSPTSQGCCEGRCD 

/V V x iVriivrV W tv 1 x LLoy 

3327 

A 

1 

418 

CSECGKSFCKKSKFIIHQRTHTGEKPYECNQCGK 
SFCQKGTLTVHQRTHTGEKPYECNECGBCNFYQK 
t ht TnHnRTHQfiFifPVFrwrnit'^Fpni^THT toh 

JUnJUlv^rl vtv 1 rtovJillSJr I Cl^o I l^vJlVor v^v^jv l itl i v^n 

QRTHSGERPYVCHDCGKTFSQKSALNDHQKIHT 
GVKLY 

3328 

A 

1 

270 

VTRKLPIFIVDAFTARAFRGSPAADCLLENELDED
IWFTPTTDLOILTSSILPSIL 

3329 

A 

45 

419 

EELSCWQIWQQIANDLTRCQDSMINNSQCHKQG 
DFPYQVGTELSIQISEDENYIVNKADGPNNTGNP 
EFPILRTQDSWRKTFLTESQRLNRDQQISIKNKLC 
QCKKGVDPIGWISHHDGHRVHKR 

3330 

A 

64 

430 

FWRNFTGLAPAAAVATTTSSSTMRFTSISNSLTST 

AAIGLSFTTSTTTTATFTTNTTTTITSGFTVNQNQ 
LLSRGFENLVPYTSTVSVVTTPVMTYGHLEGLIN 
EGNLELEIKRRLSSQATQ 

3331 

A 

3 

407 

TFGCSCTDCFFQKCCPAEAGVLLAYNKNQQIKIP 
PGTPIYECNSRCQCGPDCPNRIVQKGTQYSLCIFR 
TSNGRGWGVKTLVKIKRMSFVMEYVGEVITSEE 
AERRGQFYDNKGITYLFDLDYESDEFTVDAARY 

3332 

A 

25 

461 

PAADFVLQARPTRADILGIHSKYDEVRKAGACFY 

mTGLGPGPQALYNGEPFKHEEMNIKELKMAVL 

QRMMDASVYLQREVFLGTLNDRTNAIDFLMDR 

WWVPRINTLIL^ 

TFFFLDSQDKSA 

3333 

A 

317 

54 

AWIIFLPPLTSCPLWAPGTKHKTILEARSGLGPIK 
AYPRLGPPTPGEPEAPAQDRTFHCEICNVKVNSK 
VQLKQHISSRRHEIVDPV 

3334 

A 

304 

410 

AGPSLPSNLRQIFQSLPPFMDILLLLLFFMIIFAI 

3335 

A 

19 

418 

VESRNSRVQPRVRLNDRTNAIDFLMDRNNVVPRJ 
NTLILRTNQQYLNLISTSVTADVEDFSTFFFLDSQ 
DKSAVIAKNMYYLTQDDESIISAATLWIIADFDK 
PSGRKLLFNALKHMITSVHSRVGIIYNPFF 

3336 

A 

1

1003 

PSSYSSDELSPGEPLTSPPWAPLGAPERPEHLLNR 

VLERLAGGATRDSAASDILLDDIVLTHSLFLPTEK 

FLQELHQYFVRAGGMEGPEGLGRKQACLAMLL 

HFLDTYQGLLQEEEGAGHIIKDLYLLIMKDESLY 

QGLREDTLRLHQLVETVELKIPEENQPPSKQVKP 

LFRHFRRIDSCLQTRVAFRGSDEIFCRVYMPDHS 

YVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS 

LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFAC 

TRDSYEALVPLPEEIQVSPGDTEIHRVEPEDVANH 

LTAFHWELFRCVHELEFVDYVFHGE 

3337 

A 

444 

43 

KILLCLANQFPDISFCPALPAVVALLLHYSIDEAE 
CFEKACRILACNDPGRRLIDQSFLAFESSCMTFGD 
LVNKYCQAAHKLMVAVSEDVLQVYADWQRWL 
FGELPLCYFARVFDVFLVEGYKVLYRVALAXXF 

3338 

A 

1 

398 

FRGKVRGRSAEMPGSDTALTVDRTYSDPGRHHR 
CKSRVERHDMNTLSLPLNIRRGGSDTNLNFDVPD 
GILDFHKVKLTADSLKQKILKVTEQIKIEQTSRDG 
NVAEYLKLVNNADKQQAGRIKQVFEKKNQK

3339 

A 

1 

665 

AAAASNWGLITNIVNSIVGVSVLTMPFCFKQCGI 

VLGALLLVFCSWMTHQSCMFLVKSASLSKRRTY 

AGLAFHAYGKAGKNILVETSMIGLMLGTCIAFYV

VIGDLGSNFFARLFGFQVGGTFRMFLLFAYSLCI

VLPLSLQRNMMASIQSFSAMALLFYWFMFVIVL 

SSLKHGLFSGQWLRRVSYVRWEGVFRCIPIFGMS 

FACQSQVLPTYDSLDEPSV

3340 

A 

198 

367 

LLPLQVLQEAFSRCVAVLTRSSKPSDMSVQVCG 
YISKCYSVAAQFEECREKITEMP

3341 

A 

562 

277 

HSVDCRTPRKYLAEIVLDDDFSNKEHLKEKLDEYI 
KLWNGLVKVFRNERREGLIQARSIGAQKAKLGQ 

VT TVT H A UPC V A \/\T\l/V A DT UADlQVni? 

3342 

A 

385 

2 

NLTWWPLFRDVSFY1VDLIMLUFFLDNVIMWWE 
SLLLLTAYFCYVWMKFWQ\nEKWVKQMINRN 
KVVKVTAPEAQAKPSAARDKDErTLPAKPRLQR 
GGSSASLHNSLMRNSIFQNK1HTLDPHV 

3343 

A 

1

385

FRVDNSEEWKDVFIISSERSFKLDSLKCGTWYKV 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)

KLAAKNSVGSGRJSEIIEAKTHGREPSFSKDQHLF 
TH1NSTHARLNLQGWNNGGCPITAIVLEYRPKGT 
WAWQGLRANSSGEVFLTELREATWY 

3344 

A 

351 

147 

SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 

3345 

A 

351 

147 

SPACITSSLSQHIADPRAAPTEVKVRVMNSTAISL 
QWNRVYSDTVQGQLREYRVRKPAPDSPNYPAH 

3346 

A 

3 

1509 

AGIRHEAPPTTSNRHRRQIDRGVTHLNISGLKMP 

RGIAIDWVAGNVYWTDSGRDVIEVAQMKGENR 

KTLISGMIDEPHAIVVDPLRGTMYWSDWGNHPK 

IETAAMDGTLRETLVQDNIQWPTGLAVDYHNER 

LYWADAKJLSVIGSIRLNGTDPIVAADSKRGLSHP 

FSIDVFEDYIYGVTYINNRVFKIHKFGHSPLVNLT 

GGLSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVPVPSPTPPPD 

APRPGTCNLQCFNGGSCFLNARRQPKCRCQPRY 

TGDKCELDQCWEHCRNGGTCAASPSGMPTCRCP 

TGFTGPK CTOOVP A G YC ANN^TrTVNOGNOPO 

CRCLPGFLGDRCQYRQCSGYCENFGTCQMAAD 

GSRQCRCTAYFEGSRCEVNKCSRCLEGACVVNK 

OSGDVTCNCTDGRVAPSCLTCVGHCSNGGSCTM 

NSKMMPECQCPPHMTGPRCEEHVFSQQQPGHIA 

SIL1P 

3347 

A 

974 

666 

SPEMESHPITQAGVQWHHLSSLQPLPPGFK*FSCF 
SLPE*LGYRHVPPCLANSVFSVEMG\FLHVGQAG 
LELLTSGDLPALASQSAGITGVSHRARPENGFENIF 

3348 

A 

1 

1171 

LSKJTMPVICNEPLSFIQRLTEYM*HTYFIHRPSSL 

SDPVDRMQCVAAFAVSAVASQWERTGKPFNPLL 

GEWELVI^DLGFRLISEQVSHHPPISAFriAEGLN 

NDFIFHGSIYPKLKFWGKSVEAEPKGTITLELLEH 

NEAYTWTNPTCCVrlNIIVGKLWIEQYGNVEIINH 

KTGDKCVLNFKPCGLFGKELHKVEGYIQDKSKK 

Kl OALYGKWTFC1 YSVDPATFDAYKKNDKKNT 

EEKKNSKQMSTSEELDEMPVPDSESVFIIPGSVLL 

WRIAPRPPNSAQMYNFTSFAMVLNEVDKDMESV 

IPKTDCRLRPDIRAJVlENGEroOASEEKXRLEEKO 

RAARKNRSKSEEDWKTRWFHQGPNPYNGAQD 

Wl YSGS Y WDRNYFNLPDIY 

3349 

A 

403 

497 

NFASSSGKYLRTQKIKCLNlvnKJnrPFPTTE 

VRPP*SNRIY*ELQS*NISFS*LPN*NFASSSGKYLR 

TQK1KCLNNKFTPFPTTEKK 

3350 

A 

1 

712 

GAPAQDCICLPFPFHSSFLESDMCPARRKJQTTNP 

DFLLLLFMSVPVVSAPPFCPPAEGSRDGRPKASV 

ARPAAVHEHHSPRDCGHLPDVIRSSLGGWQPH*P 

AQPENRLL*LLPVE*GHQHPTVSPVP*AGSPGGAS 

GWPGPGQAWRVRVPGPHPLCPPASPPSPVQQ**E 

SVAAGSGLPGCVLCAAGRRPGPLPLLCVEVGQA 

LPPGAWVSSSGQRPGLTHPLAYSHGCVPSEG 

3351 

A 

1 

428 

MAAVVAATALKGRGARNARVLRGILAGATANK 

ASHNRTRALQSHSSPEGKEEPEPLSPELEYIPRKR 

GKNPMKAVGLAWAIGFPCGILL^ 

VKQMKARQN1V1RLSNTGEYESQRFRASSQSAPSP 

DVGSGVQT 

3352 

A 

2 

841 

RTLFRGRRRMDDR1SRPHPSTAESKAPTPKFDLL 
ASNFPPLPGSSSRlVxPGELVLENRMSDVVKGVYK 

EKDNEELTISCPVPADEQTECTSAQQLNMSTSSP 

CAAELTALSTTQQEKDLIEDSSVQKDGLNQTTrP 

VSPPSTTKPSRASTASPCNNNrNAATAVALQEPR 

KLSYAEVCQKPPKEPSSVLVQPLRELRSNVVSPT 

KNEDNGAPENSVEKPHEKPEARASKDYSGFRGN 

IIPRGAAGKIREQRRQFSHRAIPQGVTRRNGKEQ 

YVPPRSPK 

3353 

A 

1054 

587 

IATPTWTAPLTATPTPAHQYGPARVPNGAPRLEP 
PPGKRECRVGQYVVDLTSFEQLALPVLRNADCS 
SGPGQRVCVIDEIGKMELFSQLFIQAVRQTLSTPG 
TIILGTIPVPKGKPLALVEEIRNRKDVKVFNVTKE 
NRNHLLPDI VTC VQS SRK 

3354 

A 

56 

1268 

GMEPVGCCGECRGSSVDPRSTFVLSNLAEVVER 

VLTFLPAKALLRVACVCRLWRECVRRVLRTHRS 

VTWISAGLAEAGHLEGHCLVRVVAEELENVRJLP 

HTVL YMADSETFISLEEC RG H KRARKRTSMETA 

LALEKLFPKQCQ VLG I VTPGI V VTPM GSG SNRPQ 

EIEIGESGFALLFPQIEGIKIQPFHFIKDPKNLTLER 

HQLTEVGLLDNPELRVVLVFGYNCCKVGASNYL 

QQWSTFSDMNIILAGGQVDNLSSLTSEKNPLDI 

DASGVVGLSFSGHRIQSATVLLNEDVSDEKTAEA 

AMQRLKAANIPEHNTIGFMFACVGRGFQYYRAK 

GNVEADAFRKFFPSVPLFGFFGNGEIGCDRIVTG 

NFILRKCNEVKDDDLFHSYTTIMALIHLGSSK 

3355 

A 

1 

707 

GTSSGLGGDRLAAPGPSPPSFYPQGRGERAYDIY 

SRLLRERIVCVMGPIDDSVASLVIAQLLFLQSESN 

KKPIHMYINSPGGVVTAGLAIYDTMQYILNPICT 

WCVGQAASMGSLLLAAGTPGMRHSLPNSR1MIH 

QPSGGARGQATDIAIQAEEIMKLKJCQLYNIYAKH 

TKQSLQVIESAMERDRYMSPMEAQEFGILDKVL 

VHPPQDGEDEPTLVQKEPVEAAPAAEPVPAST 

3356 

A 

352 

338 

FNYNFCRNLHMPSFLV*PGMCGLLAKHLSFHIVG 
AFLIT/LG V A ALCKFA VA * PRKKA Y ADF YRN YN* 
IKEFEVRKANISQSTK 

3357 

A 

1 

403 

ALGSCGGLLGTGLLKGTMSGTLWSKG1FAGYKR 
RIRIQREHTAVLKIEGWYARDETEFYLRMICANV 
YKANNNTVTPVLTPDKTRVMWRKVTQAHGIS1 
MVRAQFRTNLPADAIGHRIRMML*PSRMYTTEPS 

3358 

A 

71 

2897 

FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHA 

VMDSERQVKDTDDffiSPKRSIRDSGYIDCWDSER 

SDSLSPPRHGRDDSFDSLDSFGSRSRQTPSPDVVL 

RGSSDGRGSDSESDLPHRKLPDVKKDDMSARRT 

SHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKK 

AEREEYRKSWSTATSPAGLGKKALQDYGPRT\PV 

S\DDAESTSMFDMRCEEEAAVQPHSRARQEQLQ 

LINNQLREEDDKWQDDLARWKSRKRSVSQDLIK 

KEEERKKMEKLLAGEDGTSERRKSIKTYRE1VQE 

KERRERELHEAYKNARSQEEAEGILQQYIERFTIS 

EAVLERLEMPKILERSHSTEPNLSSFLNDPNPMK 

YLRQQSLPPPKrnATVETTlARAS 

GSPSKTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 

VDGKVSVNGETVHREEEKERECPTVAPAHSLTK 

SQMFEGVARVHGSPLELKQDNGSIEINIKKPNSV 

PQELAATTEKTEPNSQEDKNDGGKSRKGNIELAS 

SEPQHFTTTVTRCSPTVAFVEFPSSPQLKNDVSEE 

KDQKKPENEMSGKVELVLSQKVVKPKSPEPEAT 

LTFPFLDKMPEANQLHLPNLNSQVDSPSSEKSPV 

TTPFKFWAWDPEEERRRQEKWQQEQERLLQER 

YQ\KEQDK\LKEE\WEKAQK£VEEEERRYYEEEP* 

IIXEDPWPFTVSSSSADQLSTSSSMTEGSGTMNKI 

DRLEEKGSLTEGALAHSGNPVSKGVHEDHQLDT 
EAGAPHCGTNPQLAQDPSQNQQTSNPTHSSEDV 
KPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PT GKG A AMTIETLNLYFHIOCFRCGVICKGOLGDA 

1 J-»VJAVVJrVr\.i Villi-* 1 I.'l'lv 1 1 lllV^/^l IVV^VJ MV^IVVJ V^JLjnJ xJi\. 

VSGTDVRIRNGLLNCNDCYMRSRSAGQPTTL 

3359 

A 

3 

368 

EVTASREGRGACAWECGSSRGPWGLLRGTFAPV 
RAATP*S*LPKGSLRHRP*/CPPPVHLPPKSSCPPR 
AWAGRATSM*TSSYSSFYOPOTP*AT VTI PPRW 

r\ V V .fx VJ AVxi. 1 OlVl 1 k>kj 1 OOL> 1 v^I 1 1 ALf V 1 1_/JT I IvO I 

YLLTHLLTLTHLHHQILFEP 

3360 

A 

2 

392 

ARGIGSLGRDHSGSGGGTGMAGAWVRKAADYV 
RSKDFRDYLMSTHFWGPVANWGLPIAAITDMK\ 
KSPE1ISRRMTFAL*CYSLTFVRFAHYVQ\PWNWL 
MLGCHTAVDFDQLISSMPCISHGMTASASAL 

3361 

A 

4619 

532 

LLLGRANSPPYNSVVRTLPPATLLLRRAGWESF 

WSCQSRSPWPPRPEVRAPAKGPRGVAGAAGACS 

AGARLGDAAGGDPASGQAARGCGARAPRGLGR 

TARARDTAMEDAGAAGPGPEPEPEPEPEPEPAPE 

PEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID 

DLAQQYADYYNTCFSDVCERMEELRKRRVSQD 

LEVEKPDASPTSLQLRSQIEESLGFCSAVSTPEVE 

RKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFR 

KNQKGIMRQTSKGEDVGYVASEITMSDEERIQL 

MMMVKEKMITIEEALARLKEYEAQHRQSAALDP 

ADWPDGSYPTFDGSSNCNSREQSDDETEESVKF 

KRLHKLVNSTRRVRXKLIRVEEMKKPXSTEGGEE 

HVFENSPVLDERS AL YSG VHKKPLFFDGSPEKPP 

EDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGES 

RGLIKPPKKMGTFFS YPEEEKA QK VSRSLTEGEM 

KKGLGSLSHGRTCSFGGFDLTNRSLHVGSNNSDP 

MGKEGDFVYKEVIKSPTASRISLGKKVKSVKET 

MRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQPD 

PEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVS 

TTDSSTSNRESVKSEDGDDEEPPYRGPFCGRARV 

HTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMG 

LLNNKVGTFNFIYVDVLSEDXEEKPKRPTRRRRK 

GRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLD 

TFKLLEEEDLDELNIRDPEHRADLLTAVELLQEY 

DSNSDQSGSQEKLLVDSQGLSGCSPRDS*CYESS 

ENLENGKTRKASLLSAKSSTEPSLKAFSRNQLGN 

YPTLPLMKSGDALKQGQEEGRLGGGLAP\DTSKS 

CDPPGC*LVLN\KNRJRXPPSFPSCRSC\ETL\EGPQ 

TVDTWPRSHSLDDLQVEPGAEQDVPTEVTEPPPQ 

IVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQS 

KRFSEPOKLTTKKLEGSIA A SGRGLSPPOCLPRNY 

DAQPPGAKHGLARTPLEGHRKGlffiFEGTHHPLG 

TKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANG 

LHPVPMGPSGALPSPDAPCLPVKRGSPASPTSPSD 

CPPALAPRPLSGQALGSPPSTRPPPWLSELPENTS 

LQEHGVKLGPALTR\KVSCARGVDLETLTE>TKL\ 

HAEGIRSSRREPYS*LRHGRCGI\P\EALVQRYAED 

LDQPERDVAANMDQIRVKQLRKQHRMAIPSGGL 

TEICRKPVSPGCIS\SVSDWLISIGLPMYAGTLSTA 

GFSTL\SQVPSLSHTCLQEAG\ITEERH1RK\LLSAA 

RLFKLPPGPEAM 

3362 

A 

1 

4653 

FRGGVGYAHTLHLLPFAGSSVVLARARRTDRWT 

SGLVEMATLSLTVNSGDPPLGALLAVEHVKDDV 

SISVEEGKENILHVSENVIFTDVNSILRYLARVAT 

TAGLYGSNLMEHTEIDHWLEFSATKLSSCDSFTS 

TINELNHCLSLRTYLVGNSLSLADLCVWATLKG 

NAAWQEQLKQKKAPVHVKRWFGFLEAQQAFQS 

VGTKWDVSTTKARVAPEKKQDVGKFVELPGAE 

MGKVTVRFPPEASGYLHIGHAKAALLNQHYQV 

NFKGKLIMRFDDTNPEKEKEDFEKVILEDVAML 

HIKPDQFTYTSDHFETIMKYAEKLIQEGKAYVDD 

TPGEQIKAEREQR1ESKHRKNPIEKNLQMWEEMK 

KGSQFGHSCCLRAKIDMSSNNGCMRDPTLYRCK 

IQPHPRTGN*Y\NV\YPTYDFACPIVDSIEGVTHAL 

RTTEYHDRDEQFYWIIEALGIRKPYIWEYSRLKL 

NNTVLSKRKLTWFVNEGLVDGWDDPRFPTVRG 

VLRRGMTVEGLKQFIAAQGSSRSVVNMEWDKI 

WAFNKKVIDPVAPRYVALLKKEVIPVNVPEAQE 

EMKEVAKHPKNPEVGLKPVWYSPKVFIEGADAE 

TFSEGEMVTFINWGNLNITKIHKNADGKIISLDAK 

LNLENKDYKXTTKVTWLAETTHALPIPVICVTYE 

HLITKPVLGKDEDFKQYVNKNSKHEELMLGDPC 

LKDLKKGDIIQLQRRGFFICDQPYEPVSPYSCKEA 

PCVLIYIPDGHTKEMPTSGSKEKTKVEATKNETS 

APFKERPTPSLNNNCTTSEDSLVLYNRVAVQGD 

VVRELKAKKAPKEDVDAAVKQLLSLKAEYKEK 

TGQEYKPGNPPAEIGQNISSNSSASILESKSLYDE 

VAAQGEVVRKLKAEKSPKAKINEAVECLLSLKA 

QYKEKTGKEYIPGQPPLSQSSDSSPTRNSEPAGLE 

TPEAKVLFDKVASQGEVVRKLKTEKAPKDQVDI 

AVQELLQLKAQYKSLIGVEYKPVSATGAEDKDK 

KKKEKENKSEKQNKPQKQNDGQRKDPSKNQGG 

GLSSSGAGEGQGPKKQTRLGLEAKJOEENLADW 

YSQVITKSEMIEYHDISGCYILRPWAYAIWEAIKD 

FFDAEIKKLGVENCYFPMFVSQSALEKEKTHVA 

DFAPEVAWVTRSGKTELAEPIAIRPTSETVMYPA 

YAKWVQSHRDLPnCLNQWCNVVRWEFKHPQPF 

LRTREFLWQEGHSAFATMEEAAEEVLQILDLYA 

QVYEELLAIPWKGRKTEKEKFAGGDYTTTIEAF 

ISASGRAIQGGTSHHLGQNFSKMFEIVFEDPKIPG 

EKQFAYQNSWGLTTRTIGVMTMVHGDNMGLVL 

PPRVACVQVVIIPCGITNALSEEDKEALIAKCNDY 

RRRLLSVNIRVRADLRDNYSPGWKFNHWELKG 

VPIRLEVGPRDMKSCQFVAVRRDTGEKLTVAEN 

EAETKLQAILEDIQVTLFTRASEDLKTHMWANT 

MEDFQKILDSGKIVQIPFCGEIDCEDWIKKTTARD 

QDLEPGAPSMGAKSLCIPFKPLCELQPGAKCVCG 

KNPAKYYTLFGRSY 

3363 

A 

3797 

1514 

LGGAAPETMPFPVTTQGSQQTQPPQKHYGITSPIS 
LAAPKETDCVLTQK\LI\ETLKPFGGFLKKEEGTA 
SRRNFNFGKN* INLVKE WIRRNQ*KAKNLPQS VI\ 




WO 01/57190 


PCT/US01/04098 



ENVXGGKIFT/FLGSYRL/GEVHTKGADIDGVCVF 

APRHVDRSDFFT\SFYDKLKLQEEVKDLRAVEEA 

FVPV1KLCFDGIEIDELFARLALQTIPEDLDLRDDS 

LLKNLDIRCIRSLNGCRVTDEILHLVPNIDNFRLT 

LRAIKLWAKRHNIYSN1LGFLGGVSWAMLVART 

CQLYPNAIASTLVHKFFLVFSKWEWPNPVLLKQP 

EECNLNLPVWDPRVNPSDRYHLMPI1TPAYPQQN 

STYNVSVSTRMVMVEEFKQGLAITDEILLSKAE 

WSKLFEAPNFFQKYKHYIVLLASAPTENQRLEW 

VGLVESKIRILVGSLEKNEFITLAHVNPQSFPAPK 

ENPDKEEFRTM WVIGL VFKKTEN SENLS VDLTY 

DIQSFTDTVYRQAINSKMFEVDMKIAAMHVKRK 

QLHQLLPNH VLQKKKKHSTEG VKLTA LNDSSLD 

LSMDSDNSMSVPSPTSATKTSPLNSSGSSQGRNS 

PAPAVTAASVTNIQATEVSVPQVNSSESSGGTSSE 

SIPQTATQPAISPPPKPTVSRVVSSTRLVNPPPRSS 

GNAATSGNAATKIPTPIVGVKRTSSPHKEESPKK 

TKTEEDETSEDANCLALSGHDKTEAKEQLDTETS 

TTQSETIQTAASLLASQKTSSTOLSDIPALPANPIP 

VIKN S IKLRLN R 

3364 

A 

54 

3073 

SARTMSYDYHQNWGRDGGPRSSGGGYGGGPAG 

GHGGNRGSGGGGGGGGGGRG/WQGPASRAPER 

PRNRHVVREKTGAEEQ/WKRRGKREL/LVHMDE 

RREEQIVQLLNSVQAKNDKESEAQISWFAPEDHG 

YGTEVSTKNTPCSENKLDIQEKKLINQEKKMFRI 

RNRSYIDRDSEYLLQENEPDGTLDQKLLEDLQKK 

KNDLRYIEMQHFREKLPSYGMQKELVNLIDNHQ 

VTVISGETGCGKTTQVTQFILDNYIERGKGSACRI 

VCTQPRRISAISVAERVAAERAESCGSGNSTGYQI 

RLQSRLPRKQGSILYCTTG1ILQWLQSDPYLSSVS 

HIVLDEIHERNLQSDVLMTVVKDLLNFRSDLKVI 

LMSATLNAEKFSEYFGNCPMIHIPGFTFPVVEYLL 

EDVIEKIRYVPEQKEHRCQFKRGFMQGHVNSQE 

KEEKEAIYKERWPDYVRELRRRYSASTVDVIEM 

MEDDKVDLNLIVALIRYIVLEEEDGAILVFLPGW 

DNISTLHDLLMSQVMFKSDKFLIIPLHSLMPTVN 

QTQVFKRTPPGVRKIVIATNIAETSITIDDVVYVID 

GGKIKETHFDTQNNISTMSAEWVSKANAKQRKG 

RAG\RVQPGSLLFICINGS*EASLLGWTIQLPEIF/R 

GTPLEELCLQIKVLRLGGI/GLFLSRLMDPPSNEA 

VLLSIRQLVRSLNALDKQEELTPLGVHLARLPVEP 

fflGKMILFGALFCCLDPVLTIAASLSFKDPFVIPLG 

KEKIADARRKELAKDTRSDHLTV VN AFEG WEE A 

RRRGFRYEKDYCWEYFLSSNTLQMLHNMKGQF 

AEHLLGAGFVSSRNPKDPESNINSDNEKJIKAVIC 

AGLYPKVAKIRLNLGKKRKMVKVYTKTDGLVA 

VHPKSVNVEQTDFHYNWLIYHLKMRTSSIYLYD 

CTEVSPYCLLFFGGDISIQKDNDQETIAVDEWIVF 

QSPARIAHLVKRAVVHMDERREEQIVQLLNSVQ 

AKNDKESEAQISWFAPEDHGYDKKYFFKE 

3365 

A 

439 

878 

ECC1WRPLRETDLLKMKRKPRASSPVVEEQPRA 

NTKETRKKKSFSQPMSASTKEESQDGRRKGK*L 

KGRARKKNAPQKSMALRILEEGSRPTPSGHSDQL 

NEEL*QNELQLEQ/PEGT*LEQQSEGTQPEQQSGR 

MPTISTLSLSSE 

3366 

A 

1 

827 

FRGYWGVREAFTDASWSGGLGPGKPGMKITRQ 

KHAKKHLGFFRNNFGVREPYQILLDGTFCQAAL 

RGR1QLREQLPRYLMGETQLCTTRCVLKELETLG 

KDLYGAKLIAQKCQVRNCPHFKNAVSGSECLLS 

MVEEGNPHHY FVATQDQNLSVKVKKKPG VPLM 

FDQNTMVLDKPSPKTIAFVKAVESGXRLSQCMRK 

KVSNISKKNRV**KTLNRGRRKKRKKISGPNPLS 

CLKKKKKAPDTQSSASEKKRKRKRIRNRSNPKV 

LSEKQNAEGE 

3367 

A 

40 

1467 

MLWGCRAKACWGPRLSDLVASLSPQRECISVHV 

GQAG VQIGN AC WELFCLEHG 1QADGTFD AQ ASK 

INDDDSFTTFFSETGNGKHVPRAVM1DLEPTVVD 

EVRAGTYRQLFHPEQLITGKEDAANNYARGHYT 

VGKESIDLVLDRIRKLTDACSGLQGFLIFHSFGGG 

TGSGFTSLLMERLSLDYGKKSKLEFAIYPAPQVS 

TAVVEPYNSILTTHTTLEHSDCAFMVDNEAIYDI 

CRRNLDIERPTYTNLNRLISQIVSSITASLRFDGAL 

NVDLTEFQTNLVPYPRIHFPLVTYAPIISAEKAYH 

EQLSVAEITSSCFEPNSQMVKCDPRHGKYMACC 

MLYRGDVVPKDVNVAIAAIKTKRTIQFVDWCPT 

GFK VG IN YQPPT V VPGGDLAK VQRA VCMLSNTT 

AIAEAWARLDHKFDLMYAKRAFVHWYVGEGM 

EEGEFS*RPGEDLA\ALE\KDYEEVGTDSFEEENE 

GEEF 

3368 

A 

3 

2597 

SLLEETMDEDSSLREYTVSLDSDMDDASKCLQE 

YDSGTGNTREALRPCPRTVSTKAQPGRSASSSSG 

DKTTSFAEQKIRKLNHTDGESSGSSSQKTTPEGSE 

LNIPHAGAWAQIPEETGLPQGRDTTQLLASEMV 

HLMMK\LK£KR\RA1*AQKKKMEAAFTKQRQKM 

GRTAFLTVVKKKGDG ISPLREEAAG AEDEKVYT 

DRAKEKESQKTDGQRSKSLADIKESMENPQAKW 

LKSPTTPIDPEKQGNLASPSEETLNEGEILEYTKSI 

EKLNSSLHFLQQEMQRLSLQQEMLMQMREQQS 

WVISPPQPSPQKQIRDFKPSKQAGLSSAIAPFSSD\ 

SPR\PTHPSSTSLLNRKSASFSVKSQRTPRPNELKI 

TPLNRTLTPPRSVDSLPRLRRFSPSQVPIQTRSFVC 

FGDDGEPQLKESKPKEEVKKEELESKGTLEQRG 

HNPEEKEIKPFESTVSEVLSLPVTETVCLTPNEDQ 

LNQPTEPPPKPVFPPTAPKNVNLIEVSLSDLKPPE 

KADVPVEKYDGESDKEQFDDDQKVCCGFFFKD 

DQKAE>ODMAMKRAALLEKRLRREKETQLRKQQ 

LEAEMEHKKEETRRKTEEERQKKEDERARJUEFIR 

QEYMRRKQLKLMEDMDTVIKPRPQVVKQKKQR 

PKSMRDHDESPKTPIKGPPVSSLSLASLNTGDNES 

VHSGKRTPRSESVEGFLSPSRCGSRNGEKDWEN 

ASTTSSVASGTEYTGPKLYKEPSAKSNKHIIQNAL 

AHCCIJ^GKVNEGQKKKILEEMEKSDANNFLILF 

RDSGCQFRSLYTYCPETEEINKLTGIGPKSITKKM 

IEGLYKYNSDRKQFSHIPAKTLSASVDAITIHSHL 

WQTKRPVTPKKLLPTKA 

3369 

A 

977 

594 

RGSGLTQEPGSVGQLALACAEGAVEWLYPAGAL 
RLTLGGPDPRARPGIACLRPVRPFAGAQVFAERA 
GGALELLLAEGPGPAGGRCVRWGPRERRALFLQ 
ATPHQDISRRVAAFRFELREDGRPEIAP 

3370 

A 

345 

1383 

DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 

YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE

TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG

AFrVMVnPQmfTT IV/TPX/Fl^A/l? AKKKFAFKTVAFK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERIUCLSRSRSRDRHRRrlR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 

3371 

A 

345 

1383 

DLSLECTGFKETNLGVYFLSSKWVLRLYALHIID 
YSAVLFPC*AMDHLESFIAECDRRTELAKKRLAE 
TQEEISAEVSAKAEKVHELNEEIGKLLAKAEQLG 
Acr.MVnPQnk'TI tV/TFVFKVtt AKtCKFAFKTVAFK 

QEKRNQDRLRRREEREREERLSRRSGSRTRDRRR 

SRSRDRRRRRSRSTSRERRKLSRSRSRPRHRJRHR 

SRSRSHSRGHRRASRDRSAKYKFSRERASREESW 

ESGRSERGPPDWRLESSNGKMASRRSEEKEAG/G 

DLLNRMIVWKHGLLI 

3372 

A 

239 

3348 

PMQNCMCSLTLSVLPLGPQPPVPEKRPPEIQHFR 

MSDDVHSLGKVTSDLAKRRKLTS\*GGLSEELGS 

ARRSGEVTLTKGDPGSLEEWETVVGDDFSLYYD 

SYSVDERVDSDSKSEVEALTEQLSEEEEEEEEEEE 

EEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK 

KWRKDSPWVKPSRKRRKREPPRAKEPRGVNGV 

GSSGPSEYMEVPLGSLELPSEGTLSPNHAGVSND 

TSSLETERGFEELPLCSCRMEAPKIDRISERAGHK 

CMATESVDGELSGCNAAILKRETMRPSSRVALM 

VLCETHRARMVKHHCCPGCGYFCTAGTFLECHP 

DFRVAHRFHKACVSQLNGMVFCPHCGEDASEA 

QEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRAD 

TSQPSARMRGHGEPRRPPCDPLADTIDSSGPSLTL 

PNGGCLSAVGLPLGPGREALEKALVIQESERRKK 

LRFHPRQLYLSVKQGELQKVILMLLDNLDPNFQS 

DQQSKRTPLHAAAQKGSVEICHVLLQAGANINA 

VDKQQRTPLMEAWNNHLEVARYMVQRGGCV 

YSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVD 

VNAQDSGGWTPIIWAAEHKHIEVIRMLLTRGAD 

VTLTDNEENICLHWASFTGSAAIAEVLLNARCDL 

HAVNYHGDTPLHIAARESYHDCVLLFLSRGANP 

ELRNKEGDTAWDLTPERSDVWFALQLNRKLRL 

GVGNRAIRTEKJICRDVARGYENVPIPCVNGVDG 

EPCPEDYKYISENCETSTMNIDRNITHLQHCTCV 

DDCSSSNCLCGQLSIRCWYDKDGRLLQEFNKIEP 

t>t tff rwn A C <ZC WTR KICK "MR VVO^GfrC VR I .Ol . YR 

TAKMGWGVRALQTIPQGTFICEYVGELISDAEAD 
VREDDSYLFDLDNKDGEVYCIDARYYGNISRFIN 
HLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGE 
ELGFDYGDRFWDIKSKYFTCQCGSEKCKHSAEAI 
ALEQSRLARLDPHPELLPELGSLPPVNT 

3373 

A 

587 

1584 

PDGRLIVSCSEDKTIK1WDTTNKQCVNNFSDSVG 

FAKFVDFNPSGTCIASAGSDQTVKVWDVRVNKL 

LQHYQVHSGGVNCISFHPSGNYLITASSDGTLKIL 

DLLKGRLIYTLQGHTGPVFTVSFSKGGELFASGG 

ADTQVLLWRTNFDELHCKGLTKRNLKRLHFDSP 

PHLLDIYPRTPHPHEEKVETVEDFFLHLLRLIQSL 

R*SICRSLLPLLWISFLLILPQQQKPVVGLCQTRV 

KJIPVDIS*TLP*CHQNVCQQPRKRKQKT*VTSPV 
rvVK/VblPLAV rDALbHlMbQLNVLTQTVSILEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 

3374 

A 

398 

21 

WLYPMALSILDIKMSPSWYFHMA1GIINWNTTAG 
LSGTLYPKVPQKYILFDSVILLLGMLRKIRQVCQ 
NVYMKGCSP1TLFKIVHYWPGAVAHAYNPSTLG 
GQ VG/W QIT* GQEFETSLD YM VKPHL Y 

3375 

A 

3 

1051 

VPTQQILAFPEQTNTKDWTVTPEHVLPESQSLLT 

FEEVAMYFSQEEWELLDPTQKALYNDVMQENY 

ETV1SLALFVLPKPKVISCLEQGEEPWVQVSPEFK 

DSAGKSPTGLKLKNDTENHQPVSLSDLEIQASAG 

VISKKAKVKVPQKTAGKENHFDMHRVGKWHQ 

DFPVKKIU<XLSTWKQELLKLMDRHKKDCAREK 

PFKCQECGKTFRVSS\DL\IKHQRIHTEEKPYKCQ 

QCDKRFRWSSDLNKHLTTHQGIKPYKCSWGGKS 

FSQNTNLHTHQRTHTGEKPFTCHECGKKFSQNS 

HLIKHRRTHTGEQPYTCSICRRNFSRRSSLLRHQK 

LHL* REACP VSHF WTCTF 

3376 

A 

137 

2329 

SFESPAPLPSTCFPQERQDPGPCYVSGAMAGLGP 

GVGDSEGGPRPLFCRKGALRQKVVHEVKSHKFT 

ARFFKQPTFCSHCTDFIWGIGKQGLQCQVCSFVV 

HRRCHEFVTFECPGAGKGPQTDDPRNKHKFRLH 

SYSSPTFCDHCGSLLYGLVHQGMKCSCCEMNVH 

RRCVRSVPSLCGVDHTERRGRLQLEIRAPTADEI 

HVTVGEARNLIPMDPNGLSDPYVKLKLIPDPRNL 

TKQKTRTVKATLNPVWNETFVFNLKPGDVERRL 

S VE V WD WDRTSRNDFMG AMSFG VSELLKAPVD 

GWYKLLNQEEGEYYNVPVADADNCSLLQKFEA 

CNYPLELYERVRMGPSSSPIPSPSPSPTDPKRCFFG 

ASPGRLHISDFSFLMVLGKGSFGKVMLAERRGSD 

ELYAIK1LKKDVIVQDDDVDCTLVEKRVLALGG 

RGPGGRPHFLTQLHSTFQTPDRLYFVMEYVTGG 

DLMYHIQQLGKFKEPHAAFYAAEIAIGLFFLHNQ 

GI1YRDLKLDNVMLDAEGHIKITDFGMCKENVFP 

GTTTRTFCGTPDYIAPEIIAYQPYGKSVDWWSFG 

VLLYEMLAGQPPFDGEDEEELFQAIMEQTVTYP 

KSLSRbA VAiCKGr LI lvHFGEAPGASGP* WGNL1 

IRAHGFFPLGFDWERLERL\EIPASFSRPRPCGPQR 

RGIFDKFFTRAAPAXLTPPARLVLDSIDQADFQGF 

TYVNPDFVQPDARSPTSTVHVPVM 

3377 

A 

918 

738 

SSMLWGFSVFRRSWILNCWLSSSQVGISAACKFS 
TLTHTHTHTHTHTRHAPFCGTCLYY 

3378 

A 

1126 

456 

FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC 

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME 

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG 

FLLrTsTYKPLDTIWNRSYTLTlPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI

3379

A 

1126 

456 

FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNC

SVISQDDFFKPESEIETDKNGFLQYDVLEALNME

KMMSAISCWMESARHSVVSTDQESAEEIPILIIEG

FLLTOYTCPLDTIWNRSYTLTIPYEECKRRRSTRVY 

QPPDSPGYFDGHVWPMYLKYRQEMQDITWEVV 

YLDGTKSEEDLFLQVYEDLIQELAKQKCLQVTA* 

RRNTTNPS/CK*IRKLQGVI 

3380

A 

1443 

794 

ARRGELAGGGRASGGRSGGDGGGGGGARAPEG 

VRAPAAGQPRATKGAPPPPGTPPPSPMSSAIERKS 

LDPSEEPVDEVLQIPPSLLTCGGCQQNIGDRYFLK 

AIDQYWHEDCLSCDLCGCRLGEVGRRLYYKLGR 

KLCRRDYLRLFGQDGLCASCDKRIRAYEMTMRV 

KDKVYHLECFKCAACQKHFCVGDRYLLINSDIV 

CEQDIYEWTKINGMI 

3381 

A 

945 

474 

SLKLRKPPLPTDGVHFVFVESQLDFWGPQEMLT 
QQGMALQNYDNKLVKCIEELCQKQEELCWQIQ 
QEEDKKQRLQNEVRQLTEKLACVNEKLARVNE 
NLARKIASCSKFYQTJAETEATYLKILESF*\TLLS 
VRKREAGNLTKATAPDQKSSGGRDS 

3382 

A 

1 

1458 

GIRGKMADRGGVGEAAAVGASPASVPGLNPTLG 

WRERLRAGLAGTGASLWFVAGLGLLYALR1PLR 

LCENLAAVTVFLNSLTPKFYVALTGTSSL1SGLIFI 

FEWWYFHKHGTSFIEQVSVSHLQPLMGGTESSIS 

EPGSPSRNRENETSRQNLSECKVWRNPLNLFRGA 

EYRRYTWVTGKEPLTYYDMNLSAQDHQTFFTC 

DTDFLRPSDTVMQKAWRERNPPARIKAAYQALE 

LN/E* LCHCICSTG * GRSNN YCRC * K VI * TGTQG R 

RNNL*AVTAVPAPKSSA*SSTEERYQCTGIY*LKI 

GNVCKKIRKNKRSSKNNFRFDF* I SvSS YHVFHP* 

KSL\KSLLELQAYPDVQAVLAKYDD1SLPKSAAIC 

YTAALLKTRTVSEKFSPETASTRGLSAAEINAVD 

AIHRAVEFNPHVPKYLLEMKSLILPPEHILKRGDS 

EAIAYAFFHLQHWKRIEGALNLLQCTWEGSKYS 

FPKVTLISLTIH 

3383 

A 

282 

2443 

RGKGFKEFFLGVCQTFIPCLCAEGIQLQFFCSGSG 

SSPLLKDLESMKTGLFFLCLLGTAAAIPTNARLLS 

DHSKPTAETVAPDNTAIPSLRAEAEENEKETAVS 

TEDDSHHKAEKSSVLKSKEESHEQSAEQG\KSS\S 

QELGIEGFKRDSDGSL*VWNL\EYGTNLKGTLDI 

KEDMSEPQEKKLSENTDFLAPGVSSFTDSNQQES 

ITKREENQEQPRNYSHHQLNRSSKHSQGLRDQG 

NQEQDPNISNGEEEEEKEPGEVGTHNDNQERKTE 

\LPREHANSKQEEDNTQSDDILEESDQPTQVSKM 

QEDEFDQGNQEQEDNSNAEMEEENASNVNKHIQ 

ETEWQSQEGKTGLEAISNHKETEEKTVSEALLME 

PTDDGNTTPRNHGVDDDGDDDGDDGGTDGPRH 

SA\SDDYFHPKPGLFWEAERA\HSIAYSPSKLREQ 

REKVHENENIGTTEPGEHQEAKKAENSSNEEETS 

SEGNMRVVHAVDSCMSFQCKRGHICKADQQGKT 

SLVSCQDPVTVCPPTKPLDQVCGTDNQTYASSCH 

LFATKCRLEGTKKGHQLQLDYFG\ASKSIPT\CRD 

FEVIQ\FPLRMRDW\LKNILMQLYEANSEHAGYL 

NEK\QRKKVKKIYL\DEKRLLAGDHPIDLLLRDFK 

KNYHMYVYPVHWQFSELDQHPMDRVLTHSELA 

PLRASLVPMEHCITRFFEECDPNKDKHITLfCEWG 

HCFGEKEEDIDENLLF 

3384 

A 

3166 

928 

PSRPHPTHAAMAGPEGFQYRALYPFRRERPEDLE 

LLPGDVLVVSRAALQALGVAEGGERCPQSVGW 

MPGLNERTRQRGDFPGTYVEFLGPVALARPGPR 

PRGPRPLPARPRDGAPEPGLTLPDLPEQFSPPDVA 

PPLLVKLVEAIERTGLDSESHYRPELPAPRTDWSL 

SDVDQWDTAALADGIKSFLLALPAPLVTPEASAE 

ARRALREAAGPVGPALEPPTLPLHRALTLRFLLQ 

HLGRVASRAPALGPAVRALGATFGPLLLRAPPPP 

SSPPPGGAPDGSEPSPDFPALLVEKLLQEHLEEQE 

VAPPALPPKPPKAK\PASTVPGPNGGSPPSL\QDA 

EWYWGDVISREEVNEKLRDTPDGTFLVRDASSKI 

QGEYTLTLRKGGNNKLIKVFHRDGHYGFSEPLTF 

CSVVDLINHYRHESLAQYNAKLDTRLLYPVSKY 

QQDQIVKEDSVEAVGAQLKVYHQQYQDKSREY 

DQLYEEYTRTSQELQMKRTAIEAFNETIKIFEEQG 

QTQEKCSKEYLERFRREGN/QTKEMQRILLNSER 

LKSRIA\EIHESRT\KL\EQQLLVPRASDNKRD/IDK 

PH*TSLKPDLMQLRKJRJDQYLVWLTQKGARQKK 

INEWLGIKNETEDQYALMEDEDDLPHHEERTWY 

VGKINRTQAEEMLSGKRDGTFLIRESSQRGCYAC 

SVVVDGDTKHCVIYRTATGFGFAEPYNLYGSLK 

ELVLHYQHASLVQHNDALTVTLAHPVRAPGPGP 

PPAAR 

3385 

A 

43 

2372 

TRDVNSWKELCFNHYNKE r n>JCYRTTRKWTNY 

KIIFLGPFRELRSQGNQVILNLGKERCQLRETGLK 

LYLPGMDSARHfflSHSTSAGPIPSQKEEEMTESQ 

GTVTFKDVAIDFTQEEWKRLDPAQRKLYRNVML 

♦NYNNL1TVGYPFTKPDVIFKLEQEEKPWVMEEE 

VLRRHWQGEIWGVDEHQKNQDRLLRQVEVKFQ 

KTLTEEKGNECQKKFANVFPLNSDFFPSRHNLYE 

YDLFGKCLEHNFDCHNNVKCLMRKEHCEYNEP 

VKSYGNSSSHFVITPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK 

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA 

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH 

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT 

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTRE/KPFKCNHCGKGFNQTLDLIRH 

LRIHTGEKPYECSNCRKAFSHKEKLIKHYKIHSRE 

QSYKCNECGKAFIKMSNLIRHQRIHTGEKPYACK

ECEKSFSQKSNLIDHEKIHTGEKPYECNECGKAFS

QKQSLIAHQKVHTGEKPYACNECGKAFPRIASLA

LHMRSHTGEKPYKCDKCGKAFSQFSMLIIHVRIH

TGEKPYECNECGKAFSQSSALTVHMRSHTGEKP 

YECKECRKAFSHKKNFITHQKIHTREKPYECNEC 

GKAFIQMSNLVRHQRIHTGEKPYICKECGKAFSQ 

KSNLIAHEKIHSGEKPYECNECGKAFSQKQNFIT

HQKVHTGEKPYDCNECGKAFSQIASLTLHLRSHT 

GEKPYECDKCGKAFSQCSLLNLHMRSHTGEKPY 

VCNECGKAFSQRTFLIVHMRGHTGEKPYECNEC 

GKAFSQSSSLTIHIRGHTGEKPYECKECRKAFSHK 

KNFITHQKIHTRENPLSVIIVEKASIRLWTSSDI 

3386 

A 

201 

1032 

WDDYPQGALRRREAAEGLHFLGPPGRVRGQLR 

GITGPA WYCHSPSHSLLS AFCHLPTPSRCPAMAR 

PPVPGSVVVPNWHES/RRGQGVPGLHSAQEPPAG 

VWAA*AASAAAA\LSIDTASYKIFVSGKSGVGKT 

ALVAKLAGLEVPVVHHETTGIQTTVVFWPAKLQ 

ASSRVVMFRPEF\TOCGESALKKFDHMLLACME 

NTDAFLFLFSFTDRASFEDLPGQLARIAGEAPGV 

VRMVIGSKFDQYMHTDVPERDLTAFRQAWELPL 

LRVKSVPGRRLG 

3387 

A 

86 

96 

GSSPDPASLITMKNQDKKNGAAKQSNPKSSPGQP 

EAGPEGAQERPSQAAPAVEAEGPGSSQAPRKPEG 

AQARTAQSGALRDVSEELSRQLEDILSTYCVDNN 

QGGPGEDGAQGEPAEPEDAEKSRTYVARNGEPE 

PTPWNGEKEPSKGDPNTEEIRQSDEVGDRDHRR 

PQEKKKAKGLGKEITLLMQTLNTLSTPEEKLAAL 

CKKYAELLEEHRNSQKQMKLLQKKQSQLVQEK 

DHLRGEHSKAVLARSKLESLCRELQRHNRSLKE 

EGVQRAREEEEKRKEVTSHFQVTLNDIQLQMEQ 

HNERNSKLRQENMELAERLKKLffiQYELREEHID 

KVFKHKDLQQQLVDAKLQQAQEMLKEAEERHQ 

REKDFLLKEAVESQRMCELMKQQETHLKQQLA 

LYTEKFEEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETTMYRSRWESSNKALLEMAEEKTVRD 

KELEGLQVKIQRLEKLCRALQT/GAQ*PVRGQRW 

G SHRTS A VRIFS 

3388 

A 

98 

3197 

ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKK 

NKGKERRDLDDLKKEVAMTEHKMSVEEVCRKY 

NTDCVQGLTHSKAQEILARDGPNALTPPPTTPEW 

VKFCRQLFGGFSILLWIGAILCFLAYGIQAGTEDD 

PSGDNLYLGIVLAAVVIITGCFSYYQEAKSSKIME 

SFKNMVPQQALVIREGEKMQVNAEEVVVGDLV 

EIKGGDRVPADLRIISAHGCKVDNSSLTGESEPQT 

RSPDCTHEVNPLKTRNITFFSNNFVEGTARGVVVA 

TGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLIT 

GVAVFLGVSFFILSLILGYTWLEAVIFLIGIIVANV 

PEGLLATVWCLTLTAKRMARKNCLVKNLEAVE 

TLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIH 

EADTTEDQSGTSFDKSSHTWVALF*H/LLGFCNR 

PVFKGGQDNIPVLKRDVAGDASESALLKCIELSS 

GSVKLMRERNKKVAEIPFNSTNKYQLSIHETEDP 

NDNRYLLVMKGAPERILDRCSTILLQGKEQPLDE 

EMKEAFQNAYLELGGLGERVLGFCHYYLPEEQF 

PKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAA 

VPDAVGKCRSAGIKVIMVTGDHPITAKAIAKGV 

GHFEGNETVEDIAARLNIPVSQVNPRDAKACVIH 

GTDLKDFTSEQIDEILQNHTEIVFARTSPQQKLIIV 

EGCQRQGAIVAVTGDGVNDSPALKKADIGVAM 

GIAGSDVSKQAADMILLDDNFASrVTGVEEGRLl 

FDNLKKSIAYTLTSNIPEITPFLLFIMANIPLPLGTI 

TILCIDLGTOMVPAISLAYEAAESDIMKRQPRNPR 

TDKLVNERLISMAYGQIGMIQALGGFFSYFVILA 

ENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQW 

TYEQRKVVEFTCHTAFFVSIVVVQWADLIICKTR 

RNSVFQQGMKNK1LIFGLFEETALAAFLSYCPGM 

DVALRMYPLKPSWWFCAFPYSFLIFVYDEIRKLI 

LRRNPGGWVEKETYY 

3389 

A 

45 

5250 

VERLLGCRNSKRTWRMLISKNMPWRRLQGISFG 

MYSAEELKKLSVKSITNPRYLDSLGNPSANGLYD 

LALGPADSKEVCSTCVQDFSNCSGHLGHIELPLT 

VYNPLLFDKLYLLLRGSCLNCHMLTCPRAVIHLL 

LCQLRVLEVGALQAVYELERILNRFLEENPDPSA 

SEIREELEQYTTEIVQNNLLGSQGAHVKNVCESK 

SKLIALFWKAHMNAKRCPHCKTGRSVVRKEHNS 

KLTITFP AM VHRTAGQKDSEPLG IEE AQI GKRG Y 

LTPTS A REHLS A L WKNEG FFLN Y LFSGMDDDGM 

ESRFNPSVFFLDFLVVPPSRYRPVSRLGDQMFTN 

GQTVNLQAVMKDVVLIRKLLALMAQEQKLPEE 

VATPTTDEEKDSLIAIDRSFLSTLPGQSLIDKLYNI 

W1RLQSHVNIVFDSEMDKLMMDKYPGIRQILEK 

KEGLFRKHMMGKRVDYAARSVICPDMYINTNEI 

GIPMVFATKLTYPQPVTPWNVQELRQAVINGPN 

VHPGASMVINEDGSRTALSAVDMTQREAVAKQ 

LLTPATGAPKPQGTKIVCRHVKNGDILLLNRQPT 

LHRPSIQAHRARILPEEKVLRLHYANCKAYNADF 

DGDEMNAHFPQSELGRAEAYVLACTDQQYLVP 

KDGQPLAGLIQDHMVSGASMTTRGCFFTREHYM 

ELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQVV 

STLLINIIPEDHIPLNLSGKAKITGKAWVKETPRSV 

PGFNPDSMCESQVIIREGELLCGVLDKAHYGSSA 

YGLVHCCYEIYGGETSGKVLTCLARLFTAYLQL 

YRGFTLGVEDILVKPKADVKRQRIIEESTHCGPQ 

AVRAALNLPEAASYDEVRGKWQDAHLGKDQRD 

FNMIDLKFKEEVNHYSNEINKACMPFGLHRQFPE 

NTLQLMVQSGAKGSTVNTMQISCLLGQIELEGRS 

TPLMASGKSLPCFEPYEFTPRAGGFVTGRFLTGIK 

PPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIK 

HLEGLVVQYDLTVRDSDGSVVQFLYGEDGLDIP 

KTQFLQPKQFPFLASNYEVIMKSQHLHEVLSRAD 

PKKALHHFRAIKKWQSKHPNTLLRRGAFLSYSQ 

KIQEAVKALICLESENRNGIVRPWDS/G/RMLRMW 

YELDEESRRKYQKKAAACPDPSLSVWRPDIYFAS 

VSETFETKVDDYSQEWAAQTEKSYEKSELSLDR 

LRTLLQLVKWQRSLCEPGEAVGLLAAQSIGEPST 

QMTLNTFHFAGRGEMNVTLGIPRLREILMVASA 

NIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCL 

GEVLQKIDVQESFCMEEKQNKFQVYQLRFQFLP 

HAYYQQEKCLRPEDILRFMETRFFKLLMESIKKK 

NNKASAFRNVNTRRATQRDLDNAGELGRSRGE 

QEGDEEEEGHIVDAEAEEGDADASDAKRKEKQE 

EEVDYESEEEEEREGEENDDEDMQEERNPHREG 

ARKTQEQDEEVGL/GH*GGPVPSRPPDAAPETHP 

QPGAPGA\EAMERRVQAVREIHPFIDDYQYDTEE 

SLWCQVTVKLPLMKINFDMSSLVVSLAHGAVIY 

ATKGITRCLLNETTNNKNEKELVLNTEGINLPELF 

KYAEVLDLRRLYSNDIHAIANTYGIEAALRVIEK 

EIKDVFAVYGIAVDPRHLSLVADYMCFEGVYKP 

LNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSH 

DELRSPSACLVVGKVVRGGTGLFELKQPLR 

3390 

A 

2 

2080 

ILPPLEGPPAQASPSSTMLGEGSQPDWPGGSRYD 
LDEIDAYWLELINSELKEMERPELDELTLERVLE 
ELETLCHQNMARAIETQEGLGIEYDEDVVCDVC 

RSPEGEDGNEMVFCDKCNVCVHQACYGILKVPT 

GSWLCRTCALGVQPKCLLCPKRGGALKPTRSGT 

KWVHVSCALWIPEVSIGCPEKMEPITKJSHIPASR 

WALSCSLCKECTGTCIQCSMPSCWTAFHVTCAF 

DHGLEMRTILADNDEVKFKSFCQEHSDGGPRNE 

PTSEPTEPSQAGEDLEKVTLRKQRLQQLEEDFYE 

LVEPAEVAERLDLAEALVDFIYQYWKLKRKANA 

NQPLLTPKTDEVDNLAQQEQDVLYRRLKLFTHL 

RQDLERVRNLCYMVTRRERTKHAICKLQEQIFH 

LQMKLIEQDLCRAGLSTSFPIDGTFFNSWLAQSV 

QITAENMAMSEWPLNNGHREDPAPGLLSEELLQ 

DEETLLSFMRDPSLRPGDPARKARGRTRLPAKK 

KPPPPPPQDGPGSRTTPDKAPKKTWGQDAGSGK 

GGQGPPTRKPPRRTSSHLPSSPAAGDCPILATPES 

PPPLAPETPDEAASVAADSDVQVP\GPAASPKPLG 

RLRPPPREPR*T\RRLPGC/ARPDAGDGDHLSAVA 

ERPKV\SLHFDTETDG\YFS\DGEMSNS\DV\EAED 

GGVQRGPREAGAKEWVRMGVLAS 

3391 

A 

1555 

327 

NSFLHFLHLKVRTMFLFPSFPVLLLSVVTASCSKT 

KACADTQKTCSMITCGIPVTNGTPGRDGRDRPK 

GEKGEPGLGQVSVAS*ISTSGRCSSKSVLEPATRG 

LKHRLGEAPLSSGPMLHSEQPL*NAIASKTKLFV 

DSLGSHISTQELGVCGCPFRGVSCLVGELALVQA 

LH*VAGESFFFGSDHWLIGCAGGEQEWSIELLGK 

KKRVTATGSSSLCLATGQGLRGLQGPPGKMGPP 

GNTGTSGIPGPRGQKGDRGDNSVAEAICLANLER 

KL*SLRSELDHTKKL*PFSLGKAMSGKKLFVTNGE 

RMPFSKVKALCAGLQATVAAPKNAEENKAIQDV 

AKDTAFLGITDEATEGQFMYLTGGRLTYSNWKK 

DEPNDHGSGEDCVILLNNGLWNGISCTSSFIAICE 

FPA 

3392 

A 

218 

1773 

GGSRRNQRRSIPVLGYFLKQKKMTKAQESLTLE 

DVAVDFTWEEWQFLSPAQKDLYRDVMLENYSN 

LVSVGYQAGKPDALTKLEQGEPLWTLEDEIHSP 

AHPEIEKADDHLQQPLQNQKILKRTGQRYEHGR 

TLKSYLGLTNQSRRYNRKEPAEFNGDGAFLHDN 

HEQMPTEIEFPESRKPISTKSQFLKHQQTHNIEKA 

HECTDCGKAFLKKSQLTEHKRIHTGKKPHVCSL 

CGKAFYKKYRLTEHERAHRGEKPHGCSLCGKAF 

YKRYRLTEHERAHKGEKPYGCSECGKAFPRKSE 

LTEHQRIHTGIKPHQCSECGRAFSRKSLLVVHQR 

THTGEKPHTCSECGKGFIQKGNLNIHQRTHTGEK 

PYGCIDCGKAFSQKSCLVAHQRYHTGKTPFVCPE 

CGQPCSQKSGLIRHQKIHSGEKPYKCSDCGKAFL 

TKTMLIVHHRTHTGERPYGCDECEKAYFYMSCL 

VKHKRIHSREKRGD/CSEGGKSFHSKSQLKS**TC 

AGEKPC*YGNCGNGGRAV 

3393 

A 

46 

1464 

ARSLSGAPSGSSRQDGTSLLRTGAGYSSSQSIETL 

SLPPGPSHLVGDKSQGGRSCQGQITSAASGKTSK 

SEPNHVIFKKISRDKSVT\IYLGNRDY\IDHV\SQV 

QPVDGWLVDPDLVKGKKVYVTLTCAFRYGQE 

DIDVIGLTFRRDLYFSRVQVYPPVGAASTPTKLQ 

ESLLKKLGSNTYPFLLTFPDYLPCSVMLQPAPQD 

SGKSCGVDFEVKAFATDSTDAEEDKIPKKSSVRL 
LIRKVQHAPLEMGPQPRAEAAWQFFMFVDKPLH 

LAVSLNKRDLFPMGSPIPVPVSVP\NNTEKPVKKI 

KA\SVEQVANVVLYS\SDY\YVKPVAMEEAQEKV 

PPNSTWTKA\LTLL\PWLVNNRERRGIALDGKIKH 

EDTNLASSTIDGEGIDRKRSWEILVSYPDQR*SSTV 

SGFLGRASPSQ*SRPT*RSQFRL\MHPQP\EDPA\K 

ESYQDANLVF\EEFARP*ILKDAGEA*\EGKRDQE 

3394 

A 

211 

1591 

RPPTMAADQRPKADTLALRQRLISSSCRLFFPEDP 

VKIVRAQGQYMYDEQGAEYIDCISNVAHVGHCH 

PLVVQAAHEQNQVLNTNSRYLHDNIVDYAQRLS 

ETLPEQLCVFYFLNSGSEANDLALRLARHYTGH 

QDVVVLDHAYHGHLSSLIDISPYKPRNLDGQKE 

WVHVAPLPDTYRGPYREDHP\THVEDGLEKAFS* 

KRVVQGRNRQICRRQIAAFFAESLPSVGGQIIPPA 

GYFSQVAEHIRKAGGVFVADEIQVGFGRVGKHF 

WAFQLQGKDFVPDIVTMGKSIGNGHPVACVAAT 

QPVARAFEATGVEYFNTFGGSPVSCAVGLAVLN 

VLEKEQLQDHATSVGSFLMQLLGQQKIKHPIVG 

DVRGVGLFIGVDLIKDEATRTPATEEAAYLVSRL 

KENYVLLSTDGPGRNILKFKPPMCFSLDNARQV 

VAKLDAILTDMEEKVRSCETLRLQP 

3395 

A 

1 

1424 

FRDGFSLRCGCNAELPGRGGDDAADRAIQRFLR 

TGAAVRYKVMKNWGVIGGIAAALAAGIYVIWG 

PITERKKRRKGLVPGLVNLGNTCFMNSLLQGLSA 

CPAFIRWLEEFTSQYSRDQKEPPSHQYLSLTLLHL 

LKALSCQEVTDDEVLHASCLLDVLRMYRWQISS 

FEEQDAHELFHVITSSLEDERDRQPRVTHLFDVH 

SLE\HSQK*LPKQITCRTRGSPHPTSNHWKSQHPF 

HGRLTSNMVCKHCEHQSPVRFDTFDSLSLSIPAA 

TWGHPLTLDHCLHHFISSESVRDWCDNCTKIEA 

KGTLNGEKVEHQRTTFVKQLKLGBCLPQCLCIHL 

QRLSWSSHGTPLKJIHEHVQF^EFLMMDIYKYHL 

LGHKPSQHNPKLNKNPGPTLELQDGPGAPTPGL\ 

NQPGAPKTQIFMNGACSPSLLPTLSAPMPFPLPV 

VPDYSSSTYLFRLMGSCRPPWETWHSGTLCSFTD 

GPHL 

3396 

A 

109 

107 

TQEAGLIFFSPPFSLSLSLSLPLSLFLLSHPHSRTPP 

NRTPRRTRIPQRPAVMYSPLCLTQDEFHPFIEALL 

PrTVRAFAYTWFNLQARXRKYFKKHEKRMSKEE 

ERAVKDELLSEKPEVKQKWASRLLAKLRKDIRP 

EYREDFVLTVTGKJKPPCCVLSNPDQKGKMRRID 

CLRQADKVWRLDLVMVELFKGIPLESTDGERLV 

KSPQCSNPGLCVQPHHIGVSVKELDLYLAYFVH 

AADSSQSESPSQAK*R*H*GPARKWDIWGFQXDS 

FVT\SGVF\SVT*A*LRVSQTPI\AAG\TGPNFSLSD 

LESSSYYSMSPGAMRRSLPSTSSTSSTKRLKSVED 

EMDSPGEEPFYTGQGRSPGSGSQSSGWHEVEPG 

MPSPTTLKKSEKSGFSSPSPSQTSSLG\TAFTQHHR 

PVITGTQSKFHIATPSIL\HFPRHSPFFQQPGPYFSH 

PAIRYHPQETLKEFVQLVCPDAGQQAGQPNGSS 

QGKVHNPFLPTPMLPPPPPPPMARPVPLPVPDTK 

PPTTSTEGGAASPTSPTTRS/PGRTRPQQPFL/SYG 

PP*PSNALIGGGGGGAGERAGERADLEM 

3397 

A 

1 

2002 

TGTLTEDGLDVMGVVPLKGQAFLPLVPEPRRLP 
VGPLLRALATCHALSRLQDTPVGDPMDLKMVES 
TGWVLEEEPAADSAFGTQVLAVMRPPLWEPQLQ 

AMEEPPVPVSVLHRFPFSSALQRMSVVVAWPGA 

TQPEAYVKGSPELVAGLCNPETVPTDFAQMLQS 

YTAAGYRVVALASKPLPSVPSLEAAQQLTRDTV 

EGDLSLLGLLVMRNLLKPQTTPVIQALRRTRJRA 

VMVTGDNLQTAVTVARGCGMVAPQEHLIIVHA 

THPERGQPASLEFLPMESPTAVNGVKDPDQAAS 

YTVEPDPRSRHLALSGPTFGIIVKHFPKLLPKVLV 

QGTVFARMAPEQKTELVCELQKLQYCVGMCGD 

GANDCGALKAADVGISLSQAEASVVSPFTSSMA 

SIECVPMVIREGRCSLDTSFSVFKYMALYSLTQFI 

SVLILYTINTNLGDLQFLAIDLVITTTVAVLMSRT 

riPAi \/t f~m \/I?T>Pr~l A T I Q\/P\/T CC1 T t r^\/f\n wr/"" 
KJtr AL V LU tv V Kr r VjAi^Lo VrV L*aoi>L.JL»Vi JY1 V V 1 0 

VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNYEN 
TVVFSLSSFQYLILAAAVSKGAPFRVRPLTNNVPF 
LLASAL*SSVLVVLVLSPGLLHGPLALRNITDTGF 
KLLLVGLVTLNFVGGLHAGERARPVPPRLPAPPP 
AQAGVSKKRFKQLERELAEQPWPPLPAGPLR 

3398 

A 

758 

1368 

FPFRMLTGYLYLMWRRKAFWSGTQRHPLPGGL 
tsjuUvKJrU KO r W r A ru U I^vj V Or o AL * IvACjibr r AN 
RPGQGE/PGLJSPKPVTEVLPDVQGAPVPVPPLPT 

pDCT DT-IT nMr\DD/T\/nUV( f C!T?C\l/l/'r>Ci r ^/^'DXr*'D A * 

rroJLrrlLv^lNi^rr/ 1 Vv^H YLJuoroWJ^ov^Orri^KA 

PSPLPPAAMRPDG*PGPASQGPDQPG\PCPPASLP 

TSPPGKGFQKTETRKHPPPRQQHKPKCTANRPLA 

SFL 

3399 

A 

906 

1091 

HHHHHHHHHHHHHLVAFGKVQ*LQNSPSSSSSS 
SSGCFWQARFSSYRTLHHHHHHHHHHHHH 

3400 

A 

1838 

325 

PFLSVHRSPHGPSKLCDDPQASLVPEPVPGGCQE 

PEEMSWPPSGEIASPPELPSSPPPGLPEVAPDATST 

GLPDTPAAPETSTNYPVECTEGSAGPQSLPLPILE 

PVKNPCSVKDQTPLQLSVEDTTSPNTKPCPPTPTT 

PETSPPPPPPPPSSTPCSAHLTPSSLFPSSLESSSEQ 

KFYNFVILHARADEHIALRVSGRSWEALGVPDG 

ATFCEDFQVPGRGELSCLQDAIDHSAFIILLLTXSN 

\FDCR\LSLHQVNQAMMSNLT\RQGSQDCVIP\FLP 

\LESSPARLSSDTASLLSGLVRLDEHSQIFARKVA 

>JTPT^PT-TR1 AAPk r ATVyfWP^PATYlT? AT PT?n^fYtJT Ti 
IN 1 V ISJ^rlivU v A IvJvAiVi. W KiVCl^L/ 1 IvAi^KJil^ol^rlLl/ 

GERMQAAALNAAYSAYLQSYLSYQAQMEQLQV 

AFGSHMSFGTGAPYGARMPFGGQVPLGAPPPFP 

TWPGCPQPPPLHAWQAGTPPPPSPQPAAFPQSLP 

FPAVPKPFPTASTAPPSEPKGWQP\LIIHHAQMVT 

SWG*NKH\MWNQRGSQAPEDKTQEAE 

3401 

A 

153 

1389 

EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 

KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DWFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

\/™KT A ^.A/f/^XTC^TTTTTT T"I VV 1 l'l'kT\/T*ri CTr TO C\ /XT A 

GQDVNIIITYKTSL*NTNLGDVAKGLQSSNFGVNI 
QTYTPSLTPQTKTGVXNLLTLVE*MWQETYFRME 
NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 
CLSPCEKNWNLKKGVFNKSRCTICSKLAEVWIFI 
PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 

3402 

A 

153 

1389 

EWGWLGAAQPPEEEAEAEDQESPSSLCREALAEI 
KKEISPLFIGMEKCSVGGLELTEQTPALLGNMAM 

ATSLMDIGDSFGHPACPLVSRSRNSPVEDDDDDD 

DVVFIESIQPPSISAPAIADQRNFIFASSKNEKPQG 

NYSVIPPSSRDLASQKGNISETIVIDDEEDIETNGG 

AEKKSSCFIEWGLPGTKNKTNDLDFSTSSLSRSK 

VMAnMHlSl^nTTTPI TT k'VTITMVTTT PTrn<?QVMA 
V iN/\VjlvlVjlNoVJl I 1 JCL, lJLJVl Hi IN V 1 1 JL-H 1 VJloo V IN A 

GQDVNIIITYKTSL*NTNLGDVAKGLQSSNFGVNI 

QTYTPSLTPQTKTGV\NLLTLVE*MWQETYFRME 

NLQLII/CPEDASTKKANVILPVESSKSFQEFYSTS 

CLSPCENNWNLKKGVFNKSRCTICSKLAEVWIFI 

PKLLFRLTVIILTFKCYYVLFHLHNARVLDV 

3403 

A 

609 

2765 

SRHCTPAERQNETHRAPDFAMSAVLGHQPPFFPA 

LTLPPNGAAALSLPGALAKPIMDQLVGAAETGIP 

FSSLGPQAHLRPLKTMEPEEEVEDDPKVHLEAKJE 

LWDQFHKRGTEMVITKSGRRMFPPFKVRCSGLD 

KXAKYILLMDIIAADDCRYKFHNSRWMVAGKA 

DPEMPKRMYIHPDSPATGEQWMSKVVTFHKLKL 

TNNISDKHGFTILNSMHKYQPRFHIVRANDILKLP 

YSTFRTYLFPETEFIAVTAYQNDKITQLKIDNNPF 

AKGFRDTGNGRREKRKQLTLQSMRVFDERHKK 

ENGTSDESSSEQAAFNCFA\QASSPAA\PL*RTSNL 

KDF\SPSRG*RATPEAEEQRGSTAPRPATRAKISP 

HPRRRSPAVTRAAPAVKAHLFAAERPRDSGRLD 

KASPDSRHSPATISSSTRGLGAEERRSPVREG\QA 

PAKVEEARALPGKEAFAPLTVQTDAAAAHLAQG 

PLPGLGFAPGLAGQQFFNGHPLFLHPSQFAMGG 

AFSSMAAAGMGPLLATVSGASTGVSGLDSTAM 

A Q A A A A Qn A C A A TT DCUT nniT\n A C/^I^T A 

MSPFGSLFPYPYTYMAAAAAA/SSAAASASVHRT 
PVFNLNTMRPRLRYSPYSIPVPVPDGSSLLTTALPS 

\yf A A A AflPT Tariff AAA! A A CPA QW AVnQHQPT WC 
IVi/VAA A vj r L» V O JVA A AL AA o Jr Ao \ V A V UoLroJbJLfJNo 

RSS\TLSSSSMSLSPKLCAEKEAATSELQSIQRLVS 
GLEAKPDRSRSASP 

3404 

A 

1082 

1308 

LKKFLEVPQSYSLLLSSPFLQ\WRA*RPQNAIG*Q 
FIIKTLVFFGIMRSAGDVLSTQVSCALRIMRTAGC 
SHSSP 

3405 

A 

1553 

559 

PRPPTQRLSRFAPPCRTAEFPFRRRAVVTRPAPPR 
ACTVVGRSSPVTGLAVGAAVAMLTVAARSRPFA 
PVLSATSRGVAGALT\P*MQATVPATPEQPVLDL 
KRPFLSRESLSGQAVRRPLVASVGLNVPASVCYS 

HTniT^VPHP^IPYPRT FVT n^TJfQQRPQQP A RKTiP*? 
tl 1 UlJv V JrL/r oC I ivt\JL*JZ» V L>LJo I ivoojAjDooJC/VjxJvvjr o 

YLVTGVTTVGVAYAAKNAVTQFVSSMSASADV 
LALAKIEIKLSDIPEGKNMAFKWRGKPLFVRHRT 
QKEIEQEAAVELSQLRDPQHDLDRVKKPEWVILI 
GVCTHLGCVPIANAGDFGGYYCPCHGSHYDASG 
RIRLGPAPLNLEVPTYEFTSDDMVIVG 

3406 

A 

83 

2671 

CLYPDFCRSVTCAMPCFTHRSCREDPGTSESREM 

DPVAFKDVAVNFTQEEWALLDISQKNLYREVML 

ETFWNLTSIGKKWKDQNBBYEYQNPRRNFRSVT 

FFK VNFTKFDSHCGF 1 V 1 V VPDDR T "WFOKKKASP 

EVKSCDSFVCEVGLGNSSSNMNIRGDTGHKACE 

CQEYGPKPWKSQQPKKAFRYHPSLRTQERDHTG 

KKPYACKECGKNIIYHSSIQRHMVVHSGDGPYK 

CKFCGKAFHWLSLYLJMERTHTGEKPYECKQCG 

KSFSYSATHRIHERTHIGEKPYECQECGKAFHSPR 
SCHRHERSHMGEKAYQCKECGKAFMCPRYVRR 

HERTHSRKKLYECKQCGKALSSLTSFQTHIRMHS 

GERPYECKTCGKGFYSAKSFQRHEKTHSGEKPY 

KCKQCGKAFTRSGSFRYHERTHTGEKPYECKQC 

GKAFRSAPNLQSHGRTHTGEKPYECKECGKAFIF 

VNNLQSHERTQTHIRIHSGERRYKCKICGKGFYC 

PKSFQRHEKTHTGEKLYEC/TATFSSSFSSSSSF*Y 

HERTHTGEKPYKCEQCGKAFRAVSIL*MHGRTH 

PEEKPYECEQ*RKAFRSAPHL*IRGRTBNGEKPY 

ACKKCGKPFGSAQNLRIHERTQTHIMHSVERPYK 

CKICGRGFYSAKSFQTHEKSYTGEKPYECKQCG 

KAFVSFTSFRYHERTHTGENPYECKQFGKAFRSV 

KNLRFHKRTHTGEKPCEYMKRLTLEGNTMNAS 

NVAKLSLLPVLFNIMKEFTLGRNPISVSNVRKPLF 

LPLLFNIMKGLTWERNPMSVCHVGKPSFLLVPFN 

IMKGLTLERSPMNISNVGKPSDQPRTFKCMEGLT 

LEKNPMNVSSMGKRSDLTRFFEYR 

3407 

A 

1426 

3 

PAAPSGASPGRVCGVETARPLGVQRRQSADEGP 

PGVAGLRHEPPTVWLGSVAHRGTWVCAHRWFG 

PAVTRAAQAATMVKLLVAKILCMVGVFFFMLL 

GSLLPVKIIETDFEKAHRSKKILSLCNTFGGGVFL 

ATC\LTALLARC*GKSSRRSWSLGHISTDYPL\AE 

TILLLGFFMTVFLEQLILTFAQENAVLHRPGDLQR 

RIGRGQRLGV*EPLHGGRAGPRAVRGAPRPRPQP 

ERAGPLA\PSPVRLLSLAFALSAHSVFEGLALGLQ 

EEGEKVVSLFVGVAVHETLVPVALGISMAGSAM 

PLRDAAKLAVTVSPMIPLGIGLGLGIEKAQGVPG 

SVASVLLQGPGGRHLSLFITFPGKSWPRSWRKKS 

DRLLKVLF\LVVGYTVLAGMGLPQVVSGLAIVPA 

AGSPPGAPGRTQAASPGRASPKSEHCGPGPPPVH 

KGPPGTRLCPRSYTLSLRALLLFKILLSLKSLYQK 

KK 

3408 

A 

106 

4514 

EARDRLAQSRAKEKELNSVASELSARQEESEHSH 

KHLIELRREFKKNVPEEIREMVAPVLKSFQAEVV 

ALSKRSQEAEAAFLSVYKQLIEAPALWELKLKSR 

PALGDSRVQQGQHDPKTDNQNTQQKAGFKEGW 

LAEASEREAFGPGFKDPVPVFEAARSLDDRLQPP 

SFDPSGQPRRDLHTSWKRNPELLSPKALKATQAE 

LLELRRKYDEEAASKADEVGLIMTNLEKANQRA 

EAAQREVESLREQLASVNSSIRLACCSPQGPSGD 

KVNFTLCSGPRLEAALASKDREILRLLKDVQHLQ 

SSLQELEEASANQIADLERQLTAKSEAIEKLEEKL 

QAQSDYEEIKTELSILKAMKLASSTCSLPQGMAK 

PEDSLLIAKEAFFPTQKFLLEKPSLLASPEEDPSED 

DSIKDSLGTEQSYPSPQQLPPPPGPEDPLSPSPGQP 

LLGPSLGPDGTRTFSLSPFPSLASGERLMMPPAAF 

KGEAGGLLVFPPAFYGAKPPTAPATPAPGPEPLG 

GPEPADGGGGGAAGPGAEEEQLDTAEIAFQVKE 

QLLKHNIGQRVFGHYVLGLSQGSVSEILARPKPX 

WRKLHG**GKEPFIKMKQFLSDEQNVLALRTIQV 

RQRGSITPRIRTPETGSDDAIKSILEQAKKEIESQK 

GGEPKTSVAPLSIANGTTPASTSEDAIKSILEQAR 

REMQAQQQALLEMEVAPRGRSVPPSPPERPSLAT 

ASQNGAPALVKQEEGSGGPAQAPLPVLSPAAFV 

QSHRKVKSEIGDAGYFDHHWASDRGLLSRPYAS 
VSPSLSSSSSSGYSGQPNGRAWPRGDEAPVPPED 

EAAAGAEDEPPRTGELKAEGATAEAGARLPYYP 

AYVPRTLKPTVPPLTPEQYELYMYREVDTLELTR 

QVKEKLAKNGICQRIFGEKVLGLSQGSVSDMLSR 

PKPWSKLTQKGREPFIRMQLWLSDQLGQAVGQQ 

PGASQASPTEPRSSPSPPPSPTEPEKSSQEPLSLSLE 

SSKENQQPEGRSSSSLSGKMYSGSQAPGGIQEIV 

AMSPELDTYSITKRVKEVLTDNNLGQRLFGESIL 

GLTQGSVSDLLSRPKPWHKLSLKGREPFVRMQL 

WLNDPHRVEKLRDMKKLEKKAYLKRRYGLIST 

GSDSESPATRSECPSPCLQPQDLSLLQIKKPRVVL 

APEEKEALRKAYQLEPYPSQQTIELLSFQLNLKT 

NTVINWFHNYRSRMRREMLVEGTQDEPDLDPSG 

GPGILPPGHSHPDPTPQSPDSETEDQKPTVKELEL 

QEGPEENSTPLTTQDKAQVRIKQEQMEEDAEEE 

a noriDnncpcT TAV/T^xf' ddi/ r?cnTT)r"\r>r>/" , xTr"\/" >, T riv 
Auol^rl^UbUbLDKUl^UrrK^ 

VAPGPLLPGGSTPDCPSLHPQQESEAGERLHPDP 

LSFKSASESSRCSLEVSLNSPSAASSPGLMMSVSP 

VPSSSAPISPSPPGAPPAKVPSASPTADMAGALHP 

SAKVNPNLQRRHEKMANLNNIIYRLERAANREE 

ALEWEF 

3409 

A 

162 

1710 

GPLSPGPYQCRPSLPAQLYPQSLMAAATLRTPTQ 

GTVTFEDVAVHFSWEEWGLLDEAQRCLYRDVM 

LENLALLTSLDVHHQKQHLGEKHFISNVGRALF 

VKTCTFHVSGEPSTCREVGKDFLAKLGFLHQQA 

AHTGEQSNSKSDGGAISHRGKTHYNWGEHTKAF 

SGKHTLVQQQRTLTTERCYICSECGKSFSKSYSL 

NDHWRLHTGEKPYECRECGKSFRQSSSLIQHRR 

GHTAVRPHECDECGKLFSNKSNLIKHRRVHTGE 

RPYECSECGKSFNQRSALLQHRGVHTGEKPYEC 

TECGKSFSHNSSLIKHQRIHSG*\RPYECTECGKSF 

bv^NooLlbHHKVri 1 GbKr YKCobCGKor KQKSAL 

LQHRGVPTGERPYECSECGKFFPYSSSLGKHQRV 

HTGSRPYECSECGKSFTQNSGLIKHRRVHTGEKP 

YECTE*KKSFSHNSSLIKHQRIHSR*KPYE\CKCG 

N\R*HPGESP*VHSECQ/KSFS*RPYLIECHTVHKG 

KTLLICRDVQLI 

3410 

A 

167 

789 

LCMKGISGGVRVAALAARAEREELPVPAMEPQP 

TA Al/fJCDlJDC A \/I CM C\/ A DCOCPDPTAT A l^nAAC 

1 A WUorrlrbA VLt^bb V ArbbbOrC i V 1 AJsXK^yb 

DKLPDLMPPA\EPLGSALELRASLEIDVAE\RGCE 
HGPSQQLPRCP*SWAWSEPWCQRPGCAV*APLP 
Y*REASFIYQSHSPAASGPFHSAGAGAVYLQAGG 
V/GEQEKEAVRKGSGSSSCSQRGP\PPPGMEVCPL 

L»Vjr W Al\-/Jr 

3411 

A 

1040 

887 

ASLSKPAGISTMPWALILLFLLTHSAVSWQAGL 
TQPPSVSKDLR\QTATLTCTGNSNNVGHQGVIWL 
QQHQGHPPKLLSYRNNNRPSGISERLSAYKSGNA 
ASLTIYGLQTEHEAD**CRPRRKLIPKTARLFFFFL 
IDNEEYLLRVY 

3412 

A 

164 


INJVvJli \Jor\.ol_»OJU llvll^, V !VO^rv^ojrJAjL.v< VV V W IX 1 /\ri_/ 

KHTQRRHQGSHRWTHLGGSTYRAVIFDMGGVLI 

PSPGRVAAEWEVQNRIPSGTILKALMEGGENGP 

WMRFMRAEITAEGFLREFGRLCSEMLKTSVPVD 

SFFSLLTSERVAKQFPVMTEAITQIRAKGLQTAVL 

SNNFYLPNQKSFLPLDRKQFDVIVESCMEGICKP 
DPRIYXLCLEQLGLQPSESIFLDDLGTNLKEAARL 

GIHTIKVNDPETAVKELEALLGFTLRVGVPNTRP 

VKJCTMEIPKDSLQKYLKDLLGIQTTGPLELLQFD 

HGQSNPTYYIRLANRDLVLRKKPPGTLLPSAHAI 

EREFRIMKALANAGVPVPNVLDLCEDSSVIGTPF 

YVMEYCPGLIYKDPSLPGLEPSHRRAIYTAMNTV 

LCKIHSVDLQAVGLEDYGKQGSTTWV/YSSRRA 

RGALLFLDWELSYPWGDPFADVGYSCLAHYLPS 

SFPVLRGINDCDLTQLGIPAAEEYFRMYCLQMGL 

nnTTxni/virvxf a corm\ / a a tt r»/^\/WD Of ffA a 

PP lbN WNr YMArorrKVAAlLvJOV YlvKoL luv^A 

SSTYAEQTGKLTEFVSNLAWDFAVKEGFRVFKE 

MPFTNPLTRSYHTWARPQSQWCPTGSRSYSSVPE 

ASPAHTSRGGLVISPESLSPPVRELYHRLKHFME 

QRVYPAEPELQSHQASAARWSPSPLIEDLKVKQP 

W*GGRSGRTSWRLLALGCHT 

3413 

A 

105 

1573 

PESRHQCFSDRSSHFLTMEMEQEKMTMNKELSP 

DAAAYCCSACHGDETWSYNHPIRGRAKSRSLSA 

SPALGSTKEFRRTRSLHGPCPVTTFGPKACVLQN 

PQTIMHIQDPASQRLTWNKSPKSVLVIKKMRDAS 

LLQPFKELCTHLMEENMIVYVEKKVLEDPAIASD 

ESFGAVKKKFCTFREDYDDISNQIDFIICLGGDGT 

LLYASSLFQGSVPPVMAFHLGSLGFLTPFSFENFQ 

SQVTQVIEGNAAVVL/RGSRLKVRVVKELRGKK 

TAVHNGLGEKGSQAAGLDMDVGKQAMQYQVL 

NEVVIDRGPSSYLSNVDVYLDGHLITTVQGD/G* 

GPQHLSWGP*AFLGRE*RLRLSLSGVIVSTPTGST 

AYAAAAGASMIHPNVPAIMITPICPHSLSFRPIVV 

PAGVELKIMLSPEARNTAWVSFDGRKRQEIRHG 

DSISITTSCYPLPSICVRDPVSDWFESLAQCLHWN 

VRKKQAHFEEEEEEEEEG 

3414 

A 

20 

2602 

VIVNKNVNWINYIYYNQQQRAFHELKEKLMSAL 

ALGLPDLTKPFTFYESEREKMAVGVLTQTVGPW 

PRPVAYLSKQLDGVSKGWPPCLRALAATALLAQ 

EADKLTLGQNLNIKAPHAVVTLMNTKGHHWLT 

NARLTKYQSLPCENPHITIEVCNTLNPTTLLPVSE 

SPGEHNCVEVLDSVYSSRPDLRDQPWASSVDWE 

LYMDGSSFINSQGERCAGYAVVTLDAVIKAKLW 

LQGTSAQKAELIALTRAVELSEGQESLEELLGRY 

FYVSHLPAFAKAVAQLCITCRQHNARQSPTVSPH 

IQAYGAAPFEDLQVDFTEMPKCGGNKYLLVLTC 

TYSGWVEAYPTRTEKAYEVTRVLLRDLIPRFGLP 

LRIGSHNGPVFVADLDCVEINVDTGVIWATWIKN 

EKDPVQLQKGKSGPSCTKGQCNPLELVITNPLDP 

RWKKGERVTLGINGAGLNPRVNILVRGEVYKCS 

LEPVFQTFYDELNVPITEFPGKTRNLFLQLAEHV 

AQSLTVTSCYVCGGTVIADQWPWEARELVPTDP 

VPDEFPAQKNHPDNFWVLKASIIRQYYIARVEKD 

FTLPVGRLHGG/RSNHTEKNPFSKFPKLQTV*AHP 

ESHRDWTAPTGLYWICGHRAYTKLP\ASSCVIGTI 

WPERIIQYYGPAT*AQDGSWGYRIPIYMINRIIRL 
QAVLKHTATGRALTILAQQETQMRNAIYQNRLA 
LDYLLAAEGEVCRKFNLTNCCLHIDNQGQVVED 
IVRDMTKVAHVPVQVWHGFDPGAMFRKWFPAL 
GGFKTLIIRVIIVIGTYLLLPRLLPVLLQMIKSFIAT 
LVYQNASAQVYYINHY 

3415 

A 

455 

108 

NMSWRGRSTYRPRPRRSLQPPELIGAMLEPTDEE 
PKEEKPPTKSRNPTPDQKREDDSG/SAA*DFKWP 
EPGKPIFQGAMVRPKTGG/CGCEGGY*CQGEDS\P 
KAEHFKMPEAGEGKSQV 

3416 

A 

1 

874 

FFFFQRJNFIEHSGSVSLLALACDLGWCEDWSCC 

LVQGGGDLVDVVQTNHGEDEAGGDTDSVDEAR 

CKESQQEAQENLREDLCLESFAKDKILQIIEGSER 

EHEETRTKQAALDGEPLGGGQLTAVHLHPSKEQ 

QGQEGGERQRGARTHHWRGWEKGRRVRLRPPS 

GKLRADQPVRKLGGPTPS/TELPGLQPHAPTPHT 

A/PATPTYSPAPDTPNPPVRWKCPLPVEPRTRQLC 

RERTRKACPPKPRPPLGLPGDPTGPVTHHAPPVS 

PTGASGQERRAEPGAVSYAHASATK 

3417 

A 

243 

847 

CLKYMYTYIFCPNCVSYKMKTDHFSLRYLHSSC 

AEDNKSSVDSSGQAAHPSKGKFFPHGTHWGTQC 

RGHISVLGWQCSCPSTGCRVGLGLAMCQTHAYI 

HTHTHTHTHTPTDYGAHHTDPLQRWGLGPR\KS 

EAGPLPQLSRDQSHPGPLSPGASPRSAGLPGWHP 

AHQEPRARGRCARDGLSLQTRLTNKYDIQCCQE 

MRK 

3418 

A 

4073 

1000 

LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 

AAKITELINKLNFLDEAEKDLATVNSNPFDDPDA 

AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 

FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNVVQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPWFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDSxSQYVVGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRNTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 
KMAKKEEKCVLQ 

3419 

A 

4073 

1000 

LDEYEARLTLANLDDFEEDNEDDDENRVNQEEK 
AAKITELMKLNFLDEAEKDLATVNSNPFDDPDA 
AELNPFGDPDSEEPITETASPRKTEDSFYNNSYNP 
FKEVQTPQYLNPFDEPEAFVTIKDSPPQSTKRKNI 

RPVDMSKYLYADSSKTEEEELDESNPFYEPKSTP 

PPNNLVNPVQELETERRVKRKAPAPPVLSPKTGV 

LNENTVSAGKDLSTSPKPSPIPSPVLGRKPNASQS 

LLVWCKEVTKNYRGVKITNFTTSWRNGLSFCAI 

LHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 

SRLLEPSDMVLLAIPDKLTVMTYLYQIRAHFSGQ 

ELNVVQIEENSSKSTYKVGNYETDTNSSVDQEKF 

YAELSDLKREPELQQPISGAVDFLSQDDSVFVND 

SGVGESESEHQTPDDHLSPSTASPYCRRTKSDTEP 

QKSQQSSGRTSGSDDPGICSNTDSTQAQVLLGKK 

RLLKAETLELSDLYVSDKKKDMSPPFICEETDEQ 

KLQTLDIGSNLEKEKLENSRSLECRSDPESPIKKT 

SLSPTSKLGYSYSRDLDLAKKKHASLRQTESDPD 

ADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 

LEQARRDAALKAGNKHNTNTATPFCNRQLSDQ 

QDEERRRQLRERARQLIAEARSGVKMSELPSYGE 

MAAEKLKERSKASGDENDNIEIDTNEEIPEGFVV 

GGGDELTNLENDLDTPEQNSKLVDLKLKKLLEV 

QPQVANSPSSAAQKAVTESSEQDMKSGTEDLRT 

ERLQKTTERFRNPVVFSKDSTVRKTQLQSFSQYI 

ENRPEMKRQRSIQEDTKKGNEEKAAITETQRKPS 

EDEVLNKGFKDS\SQYVVGELAALENEQKQIDTR 

AALVEKRLRYLMDTGRMTEEEEAMMQEWFML 

VNKKNALIRRMNQLSLLEKEHDLERRYELLNRE 

LRAMLAIEDWQKTEAQKRREQLLLDELVALVN 

KRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 

KMAKKEEKCVLQ 

3420 

A 

612 

1058 

ENLGPNYSHRLLHHPTFYKKIHKKHHEWTAPIG 

VISLYAHPIEHAVSNMLPVIVGPLVMGSHLSSITM 

WFSLALIITTISHCGYHLPFLPSPEFHDYHHLKFN 

QCYGVLGVLDHLHGTDTMFKQTKAYERHVLLL 

GFTPLSESIPDSPK 

3421 

A 

23 

2005 

LLTPCDGRIPGRPSVGAESGSDFQQRRRRRRDPE 

EPEKTELSERELAVAVAVSQENDEENEERWVGP 

LPVEATLAKKRKVLEFERVYLDNLPSASMYERS 

YMHRDVITHVVCTKTDFIITASHDGHVKFWKKIE 

EGIEFVKHFRSHLGVIESIAVSSEGALFCSVGDDK 

AMKVFDVVNFDMINMLKLGYFPGQCEWIYCPG 

DAISSVAASEKSTGKIFIYDGRGDNQPLHIFDKLH 

TSPLTQIRLNPVYKAVVSSDKSGMIEYWTGPPHE 

YKFPKNVNWEYKTDTDLYEFAKCKAYPTSVCFS 

PDGKKIATIGSDRKVRJFRFVTGKLMRVFDESLS 

MFTELQQMRQQLPDMEFGRRMAVERELEKVDA 

VRLINIVFDETGHFVLYGTMLGIKVINVETNRCV 

RILGKQENIRVMQLALFQGIAKKHRAATTffiMKA 

SENPVLQNIQADPTIVCTSFKKM^ 

DTKSADSDRDVFNEKPbKbbVMAATQAbGPKJKV 

SDSAIIHTSMGDIHTKLFPVECPKTVENFCVHSRN 

nwwriPTPiro wv. nFMTOTnnpTnTnMnnP^iwn 

UU iNvjrl i r nlviiixVjrivii^ i \jur i vj loivioojuoi w u 
GEFEDEFHSTLRHDRPYTLSMANAGSNTNGSQFF 
ITVVPTPWLDNKHTVFGRVTKGMEVVQRISMVK 
VNPKTDKPYEDVSIINITVK 

3422 

A 

2486 

433 

FVLVCAPLTWAGARHRRMAASKKPPRVRVNHQ 
DFQLRNLRIIEPNEVTHSGDTGVETDGRMPPKVT 


WO 01/57190 

PCT/US01/04098 


SELLRQLRQAMRNSEYVTEPIQAYIIPSGDAHQSE 

YIAPCDCRRAFVSGFDGSAGTAIITEEHAAMWTD 

GRYFLQAAKQMDSNWTLMKMGLKDTPTQEDW 

LVSVLPEGSRVGVDPLIIPTDYWKKMAKVLRSA 

GHHLIPVBCENLVDKIWTDRPERPCKPLLTLGLDY 

TGISWKDKVADLRLKMAERNVMWFVVTALDEI 

AWLFNLRGSDVEHNPVFFSYAIIGLETIMLFIDGD 

RIDAPSVKEHLLLDLGLEAEYRIQVHPYKSILSEL 

KALCADLSPREKVWVSDKASYAVSETIPKDHRC 

CMPYTPICLAKA\VKNSA\ESEGMRRAHIKDAVAL 

CELFNWLEKEVPKGGVTEISAADKAEEFRRQQA 

DFVDLSFPTISSTGPNGAIIHYAPVPETNRTLSLDE 

VYLIDSGAQYKDGTTDVTRTMHFGTPTAYEKEC 

FTYVLKGHIAVSAAVFPTGTKGHLLDSFARSAL 

WDSGLDYLHGTGHGVGSFLNVHEGPCGISYKTF 

SDEPLEAGMIVTDEPGYYEDGAFGIRIENVVLVV 

PVKTKYNFNNRGSLTFEPLTLVPIQTKMIDVDSL 

TDKECDWLNNYHLTCRDVIGKELQKQGRQEAL 

EWLIRETQPISKQH 

3423 

A 

5515 

934 

FKMPENPATDKLQVLQVLDRLKMKLQEKGDTS 

QNEKLSMFYETLKSPLFNQILTLQQSIKQLKGQL 

NHIPSDCSANFDFSRKGLLVFTDGSITNGNVHRPS 

NNSTVSGLFPWTPKLGNEDFNSVIQQMAQGRQIE 

YIDIERPSTGGLGFSVVALRSQNLGKVDIFVKDV 

QPGSVADRDQRLKENDQILAINHTPLDQNISHQQ 

AIALLQQTTGSLRLIVAREPVHTKSSTSSSLNDTT 

LPETVCWGHVEEVELINDGSGLGFGIVGGKTSGV 

WRTIVPGGLADRDGRLQTGDHILKIGGTNVQG 

MTSEQVAQVLRNCGNSVRMLVARDPAGDISVTP 

PAPAALPVALPTVASKGPGSDSSLFETYNVELVR 

KDGQSLGIRIVGYVGTSHTGEASGIYVKSIIPGSA 

AYHNGHIQVNDKIVAVDGVNIQGFANHDVVEVL 

RNAGQVVHLTLVRRKTSSSTSPLEPPSDRGTWE 

PLKPPALFLTGAVETETNVDGEDEEIKERIDTLKN 

DNIQALEKLEKVPDSPENELKSRWENLLGPDYEV 

MVATLDTQIADDAELQKYSKLLPIHTLRLGVEV 

DSFDGHHYISSIVSGGPVDTLGLLQPEDELLEVN 

GMQLYGKSRREAVSFLKEVPPPFTLVCCRRLFDD 

EASVDEPRRTETSLPETEVDHNMDVNTEEDDDG 

ELALWSPEVKIVELVKDCKGLGFSILDYQDPLDP 

TRSVIVIRSLVADGVAERSGGLLPGDRLVSVNEY 

CLDNTSLAEAVEILKAVPPGLVHLGICKPLVEDN 

EEESCYILHSSSNEDKTEFSGTIHDINSSLILEAPK 

GFRDEPYFKEELVDEPFLDLGKSFHSQQKEIEQS 

KEAWEMHEFLTPRLQEMDEEREMLVDEEYELY 

QDPSPSMELYPLSHIQEATPVPSVNELHFGTQWL 

HDNEPSESQEARTGRTVYSQEAQPYGYCPENVM 

KENFVMESLPSVPSTEGNSQQGRFDDLENLNSLA 

KTSLDLGMIPNDVQGPSLLIDLPVVAQRREQEDL 

PLYQHQATRVISKASAYTGMLSSRYATDTCELPE 

REEGEGEETPNFSHWGPPRIVEIFREPNVSLGISIV 

GGQTVIKRLKNGEELKGIFIKQVLEDSPAGKTNA 

LKTGDKILEVSGVDLQNASHSEAVEAIKNAGNP 

VVFIVQSLSSTPRVIPNVUNKANKITGNQNQDTQ 

EKKEKRQGTAPPPMKLPPPYKALTDDSDENEEE 
DAFTDQKIRQRYADLPGELHIIELEKDKNGLGLS 

LAGNKDRSRMSIFVVGINPEGPAAADGRMHIGD 

ELLEINNQILYGRSHQNXASAIIKTAPSKVKLVFIR 

NEDAVNQMAVTPFPVPSSSPSSIEDQSGTEPISSEE 

\DGSLEWGIKQLPESESFKLAVSQMKQQKYPTKV 

SFSSQEIPLAPASSYHSTDADFTGYGGFQAPLSVD 

PATCPIVPGQEMIIEISKRRSGLGLSIVGGKDTPLV 

NGVDLRNSSHEEAITALRQTPQKVRLVVYRDEA 

HYRDEENLEIFPVDLQKKAGRGLGLSIVGKR 

3424 

A 

2223 

1162 

HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKP\GTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSLVDFC\DS 

C\SDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 

3425 

A 

2223 

1162 

HASERVVQLPDFVWDQYTHSLGRVEREFKNRKR 

HTRRVKLVFDKGLPARPKSPLDPKKDGESLSYS 

MLPLSDGPEGSSSRPQMIRGRLCDDTKPETFNQL 

WTVEEQKKLEQLLIKYPPEEVESRRWQKIADELG 

NRTAKQVASRVQKYFIKLTKAGIPVPGRTPNLYI 

YSKKSSTSRRQHPLNKHLFKPXGTFMTSHEPPVY 

MDEDDDRSCFHSHMNTAVEDASDDESIPIMYRN 

LPEYKELLQFKKLKKQKLQHMQAESGFVQHVGF 

KCDNCGIEPIQG\VRW\HCR\DCPP\EMSL\DFaDS 

QSDCLHET\DIHKGDHQLEPIYRS\ETFLDRDYCV 

SQGTSYNYLDPNYFPANR 

3426 

A 

2 

1553 

LFVVVHDDPRWGTPRYWLGALYRNQQSSPTAPP 

GLLPLEYFPAAPHCSHSRQWRCSQTHRIHHHPQ 

MLGPCRQEICGITMAAGTLYTYPENWRAFKALI 

AAQYSGAQVRVLSAPPHFHFGQTNRTPEFLRKFP 

AGKVPAFEGDDGFCVFESNAIAYYVSNEELRGST 

PEAAAQVVQWVSFADSDIVPPASTWVFPTLGIM 

HHNKQATENAKJEEVRRILGLLDAYLKTRTFLVG 

ERVTLADITWCTLLWLYKQVLEPSFRQAFPNTN 

RWFLTCINQPQFRA\VFGEVKLCEKMAQF\DAKK 

FAETQPKKDTPRKEKGSREEKQKPQAERKEEKK 

AAAPAPEEEMDECEQALAAEPKAKDPFAHLPKS 

TFVLDEFKRKYSNEDTLSVALPYFWEHFDKDGW 

SLWYSEYRFPEELTQTFMSCNLITGMFQRLDKLR 

KNAFASVILFGTNNSSSISGVWVFRGQELAFPLSP 

DWQVDYESYTWRKLDPGSEETQTLVREYFSWE 

GAFQHVGKAFNQGKIFK 

3427 

A 

755 

52 

TAARRRQKGTAARRRQKGTAARRRQKGTAARR 

RQKGTAARRRQKGTAARRRQKGTAARRRQKGT 

AARRRQKGTAARRRQKGTAARRRQKGTAARRK 

QKGLSNLDAAEWLPPKKG\GEKKKGPFLAINEV 

VTVREYPmiLKRJHGVGFKXRAPRALKElRKFAM 

KEMGTPDVRIDTRLNKAVWAKGIRNVPYRIRVR 

LSRKRNEDEDSPNKLYTLVTYVPVTTFI<^QW 

NVDEN 
SEQ ID NO:

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
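
The legend above can be applied mechanically to any row of the table. The following is a minimal sketch in Python, not part of the original disclosure; the names `AMINO_ACIDS`, `MARKERS` and `summarize` are hypothetical and introduced only for illustration. It counts residues and legend markers in a sequence string and collects any character that the legend does not define, which is one way to spot transcription damage in a listed sequence.

```python
# One-letter residue codes exactly as listed in the table legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Non-residue symbols from the legend (the category names are assumptions).
MARKERS = {"X": "unknown", "*": "stop", "/": "deletion", "\\": "insertion"}


def summarize(seq):
    """Count residues and legend markers; collect characters outside the legend."""
    counts = {"residues": 0, "unknown": 0, "stop": 0, "deletion": 0, "insertion": 0}
    unexpected = []
    for ch in seq:
        if ch.isspace():
            continue  # line breaks and spacing carry no sequence information
        if ch in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            unexpected.append(ch)  # not defined by the legend
    return counts, unexpected


if __name__ == "__main__":
    # Made-up fragment for illustration only; it is not a sequence from the table.
    print(summarize("KLVRFCK\\EAG/XJ1"))
```

Characters returned in the second element (digits, lower-case letters, or letters such as J that the legend does not list) are not valid symbols under this legend and would need to be checked against the original filing.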

3428 

A 

4 

1939 

LPLSLSFSEMPLPLLPMDLKGEPGPPGKPGPWGP 

PGPPGFPGKPGHGKPGLHGQPGPAGPPGFSRMG 

KAGPPGLPGNVGPPGQPGLRGEPGIRGDQGLRGP 

PGPPGLPGPSGITIPGKPGAQGVPGPPGFQGEPGP

QGEPGPPGDRGLKGDNGVGQPGLPGAPGQGGAP 

GPPGLPGPAGLGKPGLDGLPGAPGDKGESGPPG 

VPGPRGEPGAVGPKGPPGVDGVGVPGAAGLPGP 

QGPSGAKGEPGTRGPPGLIGPTGYGMPGLPGPKG 

DRGPAGVPGLLGDRGEPGEDGEPGEQGPQGLGG 

PPGLPGSAGLPGRRGPPGPKGEAGPGGPPGVPGI 

RGDQGPSGLAGKPGVPGERGLPGAHGPPGPTGP 

KGEPGFTGRPGGPGVAGALGQKGDLGLPGQPGL 

RGPSGIPGLQGPAGPIGPQGLPGLKGEPGLPGPPG 

EGRAGEPGTAGP\RGPPGVPGSPGITGPPG\LPGPP 

GAPGAFDETGIAGLHLPNGGVEGAVLGKGGKPQ

FGLGELSAHATPAFTAVLTSPLPASGMPVKFDRT 

LYNGHSGYNPATGIFTCPVGGVYYFAYHVHVKG 

TNVWVALYKNNVPATYTYDEYKKGYLDQASG 

GAVLQLRPNDQVWVQMPSDQANGLYSTEYIHSS 

FSGFLLCPT 

3429 

A 

212 

1075 

EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRW 

AGPESLPPLPRSLIMDSPRAGTHQGPLDAETEVG 

ADRCTSTAYQEQRPQVEQVGKQAPLSPGLPAMG 

GPGPGPCEDPAGAGGAGAGGSEPLVTVTVQCAF 

TVALRARRGADLSSLRALLGQALPHQVAQLGQLS 

YLAPGEDGHWVPfPEEESLQRAWQDAAACPRGL 

QLQCRGAGGRPVLYQVVAQHSYSAQGPEDLGF 

RQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCFV 

VPAGPRMSGAPGRLPRSQQGDQP 

3430 

A 

799 

1989 

INKYINIRKKIKLLSPLPPLWSHLALLQASATKWV 

LTPAAFAGKLLSVFRQPLSSLWRSLVPLFCWLRA 

TFWLLATKRRKQQLVLRGPDETKEEEEDPPLPTT 

PTSVNYHFTRQCNYKCGFCFHTAKTSFVLPLEEA 

KRGLLLLK\EAG\LEKINFSGG\EPFLQDRGEYLGK 

LVRFCKVELRLPSVSI\VSNGSLIRERWFQNYG\E 

YLUlLAiDCUarDJbbVNLruuKu 

QKL\RRWCRDYRVPFKINSVINPF\NVEEDMTEQI 

KALNPVRWKVFQCLLIEGENCGEDA\LREAERFV 

IGDEEFERFLERHKEVSCLVPESNQKMKDSYLIL 

DEYMRFLNCRKGRKDPSKSILDVGVEEAIKFSGF 

DEKMFLKRGGKYIWSKADLKLDW 

3431 

A 

5468 

2146 

ACGFLPGRCHFSTFKQCQEWLSRLSRATARPAKP 

EDLFAFAYHAWCLGLTEEDQHTHLCQPGEHIRC 

RQEAELARMGFDLQNVWRVSHINSNYKLCPSYP 

QKLLVPVWITDKELENVASFRSWKRIPVVVYRH 

LRNGAAIARCSQPEISWWGWRNADDEYLVTSIA 

KACALDPGTRATGGSLSTGNNDTSEACDADFDS 

SLTACSGVESTAAPQKLLILDARSYTAAVANRAK 

GGGCECEEYYPNCEVVFMGMANIHAIRNSFQYL 

T? A VP90K>fPr>P9MWT <sAT ClVTI QVIV/TT V A 
rvr\ v ^ov^ivijr Ui din w i_<o/\l>jc*o i jvw i-»v^rUL»o v ivll^jv/v 

AVLVANTVDREGRPVLVHCSDGWDRTPQIVALA 

KILLDPYYRTLEGFQVLVESDWLDFGHKFGDRC

GHQENVEDQNEQCPVFLQWLDSVHQLLKQFPCL 

FEFNEAFLVKLVQHTYSCLYGTFLANNPC\EREK 

RMYK/RGTCSVWALLRAGNKNFHNFLYTPSSD 

MVLHPVCHVRALHLWTAVYLPASSPCTLGEEN

MDLYLSPVAQSQEFSGRSLDRLPKTRSMDDLLS 

ACDTSSPLTRTSSDPNLNNHCQEVRVGLEPWHS 

NPEGSETSFVDSGVGGPQQTVGEVGLPPPLPSSQ 

KDYLSNKPFKSHKSCSPSYKLLNTAVPREMKSNT 

SDPEIKVLEETKGPAPDPSAQDELGRTLDGIGEPP 

EHCPETEAVSALSKVISNKCDGVCNFPESSQNSPT 

GTPQQAQPDSMLGVPSKCVLDHSLSTVCNPPSA 

ACQTPLDPSTDFXLNQDPSGSVASISHQEQLSSVP

DLTHGEEDIGKRGNNRNGQLLENPRFGKMPLEL 

VRKPISQSQISEFSFLGSNWDSFQGMVTSFPSGEA 

TPRRLLSYGCCSKRPNSKQMRATGPCFGGQWAQ 

REGVKSPVCSSHSNGHCTGPGGKNQMWLSSHPK 

QVSSTKPVPLNCPSPVPPLYLDDDGLPFPTDVIQH 

RLRQIEAGYKQEVEQLRJKQVRELQMRLDIRH 

APPAEPPMDYEDDFTCLKESDGSDTEDFGSDHSE 

DCLSEASWEPVDKKETEVTRWVPDHMASHCYN 

CDCEFWLAKRRHHCRNCGNVFCAGCCHLKLPIP 

DQQLYDPVLVCNSCYEHIQVSRARELMSQQLKK 

PIATASS 

3432 

A 

36 

1873 

MTFFSSVADFIGLDPRIAAWLIDPSDATPSFEDLV 

EKYCEKSITVKVNSTYGNSSRNIVNQNVRENLKT 

LYRLTMDLCSKLKDYGLWQLFRTLELPLIPILAV 

MESHAIQVNKEEMEKTSALLGARLKELEQEAHF 

VAGERFLITSNNQLREILFGKLKLHLLSQRNSLPR

TGLQKYPSTVSEALNALRDLHPLPKIILEYRQVH

KIKSTFVDGLLACMKKGSISSTWNQTGTVTGRLS

AKHPNIQGISKHPIQITTPKNFKGKEDKILTISPRA 

MFVSSKGHTFLAADFSQIELRILTHLSGDPELLKL 

FQESERDDVFSTLTSQWKDVPVEQVTHADREQT 

KKVVYAWYGAGKERLAACLGVPIQEAAQFLES 

FLQKYKKIKDFARAAIAQCHQTGCVVSIMGRJRR 

PLPRIHAHDQQLRAQAERQAVNFVVQGSAADLC 

1ST A X iTTTTI npOP A \ 7 A A C T TTI T* A T>T \ 7 A /^VTT ITM?T T T?T?\ TT~> 

KLAMIHVFTAVAASHTLTARLVAQIHDELLFEVE

DPQIPECAALVRRTMESLEQVPLKVSLSAGRSWG 

HLVPLQEAWVALRQAHVALSLPATAWLPLGPLP 

APSPFIPCIFRLHFVCSPRQQWEERTGFQQSrVWPS 

PRSPALYAPGRJNPLGLGWPAIPWSKCLCKALKK 

K 

3433 

A 

1481 

476 

IPPKERAPGIRASCLAITAGARPTSYGRVGCEGDV 
RLSPVSPLLAPPDPRLASRWEGRSRMKGKKGIVA 
ASGSETEDEDSMDIPLDLSSSAGSGKRRRRGNLP 
KESVQILRDWLYEHRYNAYPSEQEKALLSQQTH 

I CTI / F v\//">XT 1 ll/"CrKT ADDDT 1 T>T\AVfT T> VT>/^V'TYD'NT/'"\l?*T v r 

LMLQVCN WrLNAKKKJLLr^ 11 

SRRGAKISETSSVESVMGIKNFMPALEETPFHSFT\ 

AGPNPTLGXRPLSAKP/SQSPGSVLARPSVICHTTV 

TAIE1U,SLSLSCQSVGCGQNT\DIQQIAT\RNLRDS 

SLMYPEDTCKSGPSTNTQSGLFNTPPPTPPDLNQ 

DFSGFQLLVDVALKRAAEMELQAKLTA 

3434

A 

1 /zu 

1Z4J 

INOi V i r LjvjoJtv I NWAyUo/inErUOi lUiOrorunny 

VPALLRGEPRGGAAAGSFWKPLHQHSCGLRPPP/ 
PPD/RLSRLPGKTLSACDRENGARRPLLLGSTSFIP 
IGRRTYASAAEPVGSKAVLVTGCDSGFGFSLAKH 
LHSKGFLVFAGCLMKDKGHDGVKELDSLNSDRL 
RTVQLNVCSSEEVEKV/VGDCPLEPEGP\EKGMW 

GLVNNAGISTFGEVEFTSLETYKQVAEVNLWGT 
VRMTKSFT PT TRRAKGR VVNT^MI GRMANPAR 

SPYCITKFGVEAFSDCLRYEMYPLGVKVSVVEPG 
NFIAATSLYSPESIQAIAKKMWEELPEVVRKDYG
KKYFDEKIAKMETYCSSGSTDTSPVIDAVTHALT
ATTPYTRYHPMDYYWWLRMQIMTHLPGAISDM
IYIR

3435 

A 

842 

3595 

ENQQQMLVAKEQRLHFLKQQERRQQQSISENEK 

LQKLKERVEAQENKLKKIRAMRGQVDYSKIMN 

GNLSAEIERFSAMFQEKKQEVQTAILRVDQLSQQ 

LEDLKKGKJLNGFQSYNGKLTGPAAVELKRLYQE 

LQIRNQLNQEQNSKLQQQKELLNKRNMEVAMM 

DKRISELRERLYGKKIQACEKVFLNRVNGTSSPQ 

SPLSTSGRVAAVGPYIQVPSAGSFPVLGDPIKPQS 

LSIASNAAHGRSKSANDGNWPTLKQNSSSSVKP 

VQVAGADWKDPSVEGSVKQGTVSSQPVPFSALG 

PTEKPGIEIGKVPPPIPGVGKQLPPSYGTYPSPTPL 

GPGSTSSLERRKEGSLPRPSAGLPSRQRPTLLPAT 

GSTPQPGSSQQIQQRISVPPSPTYPPAGPPAFPAGD 

SKPELPLTVAIRPFLADKGSRPQSPRKGPQTVNSS 

SIYSMYLQQATPPKNYQPAAHSALNKSVKAVYG 

KPVLPSGSTSPSPLPFLHGSLSTGTPQPQPPSESTE 

KEPEQDGPAAPADGSTVESLPRPLSPTKLTPIVHS 

PLRYQSDADLEALRRKLANAPRPLKKRSSITEPE 

GPGGPNIQKLLYQRFNTLAGGMEGTPFYQPSPSQ 

DFMVTLADVDNGNTNANGNLEELPPAQPTAPLP 

AEPAPSSDANDNELPSPEPEELICPQTTHQTAEPA 

EDNNNNVATVPTTEQIPSPVAEAPSPGEEQVPPA 

PLPPASHPPATSTNKRTNLKKPNSERTGHGLRVR 

FNPI Af I I r>A^T FflFFOI VORTTYFVFnP<?KPNrVF 

GITPLHNAVCAGHHHIVKFLLDFGVNVNAADSD
GWTPLHCAASCNSVHLCKQLVESGAAIFASTISD 
IETAADKCEEMEEGYIQCSQFLYGVQEKLGVMN 
KGVAYALWDYEAQNSDELSFHEGDALTILRKJKD 
E 

3436 

A 

3 

2604 

GSTHASEKMKTGRSALVVTDTGDMSVLNSPRHQ 

SCIMHVDMDCFFVSVGIRNRPDLKGKPVAVTSN 

RGTGRAPLRPGANPQLEWQYYQNKILKGKADIP 

DSSLWENPDSAQANGIDSVLSRAEIASCSYEARQ 

LGIKNGMFFGHAKQLCPNLQAVPYDFHAYKEVA 

QTLYETLAS\YTHNIEAVSCDEALVDITEILAETK 

LTPDEFANAVRMEIKDQTKCAASVGIGSNILLAR 

MATRKAKPDGQYHLKPEEVDDFIRGQLVTNLPG 

VGHSMESKLASLGIKTCGDLQYMTMAKLQKEF 

GPKTGQMLYRFCRGLDDRPVRTEKERKSVSAEI 

NYGIRFTQPKEAEAFLLSLSEEIQRRLEATGMKG 

KRLTLKIMVRKPGAPVETAKFGGHGICDNIARTV 

TLDQATDNAKIIGKAMLNMFHTMKLNISDMRGV 

GIHVNQLVPTNLNPSTCPSRPSVQSSHFPSGSYSV 

RDVFOVOKAKKSTEEEHKEVFRAAVDLEISSASR 

TCTFLPPFPAHLPTSPDTNKAESSGKWNGLHTPV 

SVQSRLNLSIEVPSPSQLDQSVLEALPPDLREQVE 

QVCAVQQAESHGDKKKEPVNGCNTGILPQPVGT 

VLLQIPEPQESNSDAGINLIALPAFSQVDPEVFAA 

LPAELQRELKAAYDQRQRQGENSTHQQSASASV 

PKNPLLHLKAA V KJiKJKJ^ 

NNKLLNSPAKTLPGACGSPQKLIDGFLKHEGPPA 
EKPLEELSASTSGVPGLSSLQSDPAGCVRPPAPNL 
AGAVEFNDVKTLLREWITTISDPMEEDILQVVKY 
CTDLIEEKDLEKLDLVIKYMKRLMQQSVESVWN 
MAFDFILDNVQVVLQQTYGSTLKVT 

3437 

A 

32 

4038 

SLLRLLKAQWGSSGAASEPVVLGEEGCGFPSTNE 

YPDLEEERATYPQEEDRFLTPGRAQLLWSPWSPL 

DQEEACASRQLHSLASFSTVTARRNPLHNPWGM 

ELAASENTDSPSPRPLRPGVTLPPGALTMNTKDT 

TEVAENSHHLKJFLPKKLLECLPRCPLLPPERLRW 

NTNEEIASYLITFEKHDEWLSCAPKTRPQNGSIIL 

YNRKKVKYRKDGYLWKXRKDGKTTREDHMKL 

KVQGMECLYGCYVHSSIVPTFHRRCYWLLQNPD 

IVLVHYLNVPALEDCGKGCSPIFCSISSDRREWLK 

WSREELLGQLKPMFHGIKWSCGNGTEEFSVEHL 

VQQILDTHPTKPAPRTHACLCSGGLGSGSLTHKC

SSTKHRIISPKVEPRALTLTSIPHPHPPEPPPLIAPLP 

PELPKAHTSPSSSSSSSSSGFAEPLEIRPSPPTSRGG 

SSRGGTAILLLTGLEQRAGGLTPTRHLAPQADPR 

PSMSLAVWGTEPSAPPAPPSPAFDPDRFLNSPQR 

GQTYGGGQGVSPDFPEAEAAHTPCSALEPAAAL 

EPQAAARGPPPQSVAGGRRGNCFFIQDDDSGEEL 

KGHGAAPPIPSPPPSPPPSPAPLEPSSRVGRGEALF 

GGPVGASELEPFSLSSFPDLMGELISDEAPSIPAPT 

PQLSPALSTITDFSPEWSYPEGGVKVLITGPWTEA 

AEHYSCVFDHIAVPASLVQPGVLRCYCPAHEVG 

LVSLQVAGREGPLSASVLFEYRARRFLSLPSTQL 

DWLSLDDNQFRMSILERLEQMEKRMAEIAAAGQ 

VPCQGPDAPPVQDEGQGPGFEARVVVLVESMIP 

RSTWKGPERLAHGSPFRGMSLLHLAAAQGYARL 

BETLSQWRSVETGSLDLEQEVDPLNVDHFSCTPL 

MWACALGHLEAAVLLFRWNRQALSIPDSLGRLP 

LSVAHSRGHVRLARCLEELQRQEPSVEPPFALSP 

PSSSPDTGLSSVSSPSELSDGTFSVTSAYSSAPDGS 

PPPAPLPASEMTMEDMAPGQLSSGVPEAPLLLM 

DYEATNSKGPLSSLPALPPASDDGAAPEDADSPQ 

AVDVIPVDMrSLAKQIIEATPERIKREDFVGLPEA 

GASMRERTGAVGLSETMSWLASYIAENVDHFPS 

STPPSEL\PFER\GRLGLSLTAPSWAEFLSCIPPVGK 

IGKLlr ALL 1 LVSDXQ KbL Y bAAK V IKl 1 Ar KK Y Jv 

GRRLKEQQEVAAAVIQRCYRKYKQLTWIALKFA 

LYKKMTQAAJLIQSKFRSYYEQKRFQQSRRAAV 

LIQQHYRSYRRRPGPPHRTSATLPARNKGSFLTK 

KQDQAARKIMRFLRRCRHRMRELKQNQELEGLP 

QPGLAT 

3438 

A 

469 

2602 

FGRLLWGTAFKSWKMKAPIPHLILLYATFTQSLK 

VVTKRGSADGCTDWSIDIKKYQVLVGEPVRIKC 

ALFYGYIRTNYSLAQSAGLSLMWYKSSGPGDFE 

NSTYCMKVSISLTVGENDTGLCYNSKMKYFEKA 

ELSKSKEISCRDIEDFLLPTREPEILWYKECRTKT 

WRPSIVFKRDTLLIREVREDDIGNYTCELKYGGF 

VVRRTTELTVTAPLTDKPPKLLYPMESKLTIQET 

QLGDSANLTCRAFFGYSGDVSPLIYWMKGEKFIE 

DLDENRVWESDI\KJLKEHLGEQEVSISLIVDSVEE 

GDLGNYSCYVENGNGRRHASVLLHKRELMYTV 

ELAGGLGAILLLLVCLVTIYKCYKJEIMLFYRNHF

GAEELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGTYI 

EDVARCVDQSKRLIIVMTPNYVVRRGWSIFELET 

RLKNMLVTGEIKV1LIECSELRGI^4NYQEVEALK 

HTIKLLTVIKWHGPKCNKLNSKFWKRLQYEMPF 

KRIEPITHEQALDVSEQGPFGELQTVSAISMAAAT

STALATAHPDLRSTFHNTYHSQMRQKHYYRSYE 

YDVPPTGTLPLTSIGNQHTYCNIPMTLINGQRPQT 

KSSREQNPDEAHTNSAILPLLPRETSISSVIW 

3439 

A 

251 

2037 

GPGNSSILIGGGHLFLIRSCLNLLLLNSKENTEHT 

MAKKVAVIGAGVSGLSSIKCCVDEDLEPTCFERS 

DDIGGLWKFTERGSSLSVMIWPLALSLLRHGGFC 

YSDFPFHEDYPNFMNHEKFWDYLQEFAEHFDLL 

KYIQFKTTVCGITKRPDFSETGQWDVVTETEGKQ 

NRAVFDAVMVCTGHFLNPHLPLEAFPGIHKFKG 

QILHSQEYKIPEGFQGKRVLVIGLGNTGGDIAVEL

SRTAAQVLLSTRTGTWVLGRSSDWGYPYNMMV

TRRCCSFIAQVLPSRFLNWIQERKLNKRFNHEDY

GLSITKGKKAKFIVNDELPNCILCGAITMKTSVIE 

FTETSAVFEDGTVEENIDVVIFTTGYTFSFPFFEEP 

LKSLCTKKIFLYKQVFPLNLERATLAIIGLIGLKGS

ILSGTELQARWVTRVFKGLCKRPASQKLMMEAT

EKEQLIKRGVFKDTSKDKFDYIAYMDDIAACIGT

KPSIPLLFLKDPRLAWEVFFGPCTPYQYRVLMGPG 

KWDGARNAILTQWDRTLKPLKTRJVPDSSKAWP 

SM\SHYLKAWGAPVLLASLLLTCK\SSLFLKLVRD 

KLQDRMSPYLVSLWRG 

3440 

A 

1 

3533 

IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVM 

ENSKVLGESMAGISQNAKTGDLPAFGECVGIASK 

ALCGLTEAAAQAAYLVGIFDPNSQAGHQGLVDP 

IQFARANQAIQMACQNLVDPGSSPSQVLSAATIV 

AKHTSALCNACRIASSKTANPVAKRHFVQSAKE

VANSTANLVKTIKALDGDFSEDNRNKCRIATAPL 

IEAVENLTAFASNPEFVSIPAQISSEGSQAQEPILV 

SAKPMLESSSYLIRTARSLAINPKDPPTWSVLAG 

HSHTVSDSIKSLITSIRDKAPGQRECDYSIDGINRC 

IRDIEQASLAAVSQSLATRDDISVEALQEQLTSVV 

QEIGHLIDPIATAARGEAAQLGHKGTQLASYFEP 

LILAAVGVASKILDHQQQMTVLDQTKTLAESAL 

QMLYAAK^GGGNPKAQHTHDAITEAAQLMKEA 

VDDIMVTLNEAASEVGLVGGMVDAIAEAMSKL 

DEGTPPEPKGTFVDYQTTVVKYSKAIAVTAQEM

MTKSVTNPEELGGLASQMTSDYGHLAFQGQMA 

AATAEPEEIGFQIRTRVQDLGHGCIFLVQKAGVAL 

QVCPTDSYTKRELIECARAVTEKVSLVLSALQAG 

NKGTQACITAATAVSGIIADLDTTIMFATAGTLN 

AbNob 1 r AJJHKbNlLK 1 AKAL VbD 1 KLL VoOAAb 

TPDKLAQAAQSSAATUQLAEWKLGAASLGSD 

DPETQVVLINAIKDVAKALSDLISATKGAASKPV 

DDPSMYQLKGAAKVMVTNVTSLLKTVKAVEDE 

ATRGTRALEATIECIKQELTVFQSKDVPEKTSSPE 

ESIRMTKGITMATAKAVAAGNSCRQEDVIATAN

LSRKAVSDMLTACKQASFHPDVSDEVRTRALRF 

GTECTLGYLDLLEHVLVILQKPTPELKQQLAAFS 

KRVAGAVTELIQAAEAMKGTEWVDPEDPTVIAE 

TELLGAAASIEAAAKKLEQLKPRAKPKQADETL 

DFEEQILEAAKSIAAATSALVKSASAAQRELVAQ 

GKVGSIPANAADDGQWSQGLISAARMVAAATSS

LCEAANASVQGHASEEKLISSAKQVAASTAQLL 

VACKVKADQDSEAMRRLQAAGNAVKRASDNL 

VRAAQKAAFGKADDDDVVVKTKFVGGIAQIIAA 

QEEMLKKEl^LEEARKKLAQIRQQQYKFLPTEL 

REDEG 

3441 

A 

3 

1584 

NSARGGVGVRGARAMATVQEKAAALNLSALHS 

PAHRPPGFSVAQKPFGATYVWSSIINTLQTQVEV 

KKRJUmLKRHNDCFVGSEAVDVIFSHLIQNKYF 

GDVDIPRAKVVRVCQALMDYKVFEAVPTKVFG 

KI)KKPTFEDSSCSLYRFTTIPNQDSQLGKENKLY 

SPARYADALFKSSDIRSASLEDLWENLSLKPANS 

PHVNISTTLSPQVINEVWQEETIGRLLQLVDLPLL 

DSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGIL 

KAYSDSQEDEWLSAAIDCLEYLPDQMVVEISRSF 

PEQPDRTDLVKELLFDAIGRYYSSREPLLNHLSD 

VHNGIAELLVNGKTEIALEATQLLLKLLDFQNRE

EFRRLLYFMAVAANPSEFKLQKESDNRMVVKRI 

FSKj\IVDNKNLSKGKTDLLVLFL\MDHQKDVFKI 

PGTL\HKIVS\VK\LMAIQNGRDPNRDAGYIYCQRI

DQRDYSNITEKTTIDELLYLLKTLDEDSKLSAKE 

KKKVLLGQFYKCHPDIFIErlFGD 

3442 

A 

160 

822 

SPASGHCRLNGAAVAMFGCLVAGRLVQTAAQQ 

VAEDKPVFDLPDYESlTvIHVVVFMLGTIPFPEGMG 

GSVYFSYPDSNGMPVWQLLGFVTNGKPSAJFK1S 

GLKSGEGSQHPFGAMNIVRTPSVAQIGISVELLDS 

MAQQTPVGNAAVSSVDSFTQFTQKMLDNFYNF 

ASSFAVSQ/VPDDTQ/RPSEMFIPANVVLKWYENF 

QlxRTSTEPSLLENOWIKINF 

3443 

A 

3 

1373 

SWHVRRRWLEATMAGGMKVAVSPAVGPGPWG 

SGVGGGGTVRLLLILSGCLVYGTAETDVNVVML 

QESQVCEKJIASQQFCYTNVLIPQWHDIWTRIQIR 

VNSSl^VRVTQVENEEKLKELEQFSrVVNFFSSFL 

KEKLNDTYVNVGLYSTKTCLKVEIffiKDTKYSVl 

VIRRFDPKLFLVFLLGLMLFFCGDLLSRSQIFYYS 

TGMTVGlVASLVLinFILSKFMPKXSPIYVILVGGW 

SFSLYLIQLVFKNLQEIWRCYWQYLLSYVLTVGF 

MSFAVCYKYGPLfcNbRSlNLL 1 W 1 LQLMuJLCrM 

YSGIQIPHIALAIIHALCTKNLEHPIQWLYITCRKV 

CKGAEKPVPPRLLTEEEYRJQGEVETRKALEELR 

EFCNSPDCSAWKTVSRIQSPKRFADFVEGSSHLT 

PNEVSVHEQEYGLGSHAQDEIYEEASSEEEDSYS 

RCPAITQNNFLT 

3444 

A 

566 

1718 

KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS

MRSHFANYIARDTRRLGATILDRIHSLQINSSLST 

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMEKYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW 

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 

3445 

A 

566 

1718 

KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSHFANYIARDTRRLGATILDRIHSLQINSSLST

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN 

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 

3446 

A 

566 

1718 

KGLERTCCAMEESDSEKTTEKENLGPRMDPPLG 

EPG\GSLGWVLPNTAMKKKVLLMGKSGSGKTS 

MRSHFANYIARDTRRLGATILDRIHSLQINSSLST

YSLVDSVGNTKTFDVEHSHVRFLGNLVLNLWDC 

GGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESR 

ELEKDMHYYQSCLEAILQNSPDAKIFCLVHKMD 

LVQEDQRDLIFKEREEDLRRLSRPLECSCFRTSIW

DETLYKAWSSIVYQLIPNVQQLEMNLRNFAEIIE 

ADEVLLFERATFLVISHYQCKEQRDAHRFEKISNI 

IKQFKLSCSKLAASFQSMEVRNSNFAAFIDIFTSN

TYVMVVMSDPSIPSAATLINIRNARKHFEKLERV 

DGPKQCLLMR 

3447 

A 

1 

2930 

VLLGPLWDKLSTADHPVIVTMASKRKSTTPCMIP 

VKTVVLQDASMEAQPAETLPEGPQQDLPPEASA 

ASSEAAQNPSSTDGSTLANGHRSTLDGYLYSCK 

YCDFRSHDMTQFVGHMNSEHTDFNKDPTFVCSG 

CSFLAKTPEGLSLHNATCHSGEASFVWNVAKPD 

NHVVVEQSIPESTSTPDLAGEPSAEGADGQAEIUT 

KTPIMKIMKGKAEAKKIHTLKENVPSQPVGEALP 

KLSTGEMEVREGDHSFINGAVPVRQASASSAKN 

PHAANGPLIGTVPVLPAGIAQFLSLQQQPPVHAQ 

HHVHQPLPTAKALPKVMIPLSSIPTYSAAMDSNS 

FLKNSFHKPPYPTKAELCYLTVVTKYPEEQLKIW 

FTAQRLKQGISWSPEEIEDARKKMFNTVIQSVPQ 

PTITVLNTPLVASAGNVQHLIQAALPGHVVGQPE 

GTGGGLLVTQPLMANGLQATSSPLPLTVTSVPK 

QPGVAPINTVCSNTTSAVKVVNAAQSLLTACPSI 

TSQAFLDASIYKNKKSHEQLSALKGSFCRNQFPG 

QSEVEHLTKVTGLSTREVRKWFSDRRYHCRNLK 

GSRAMIPGDHRSIIIDSVPEVSFSPSSKVPEVTCIPT 

TATLATHPSAKRQSWHQTPDFTPTKYKERAPEQ 

LRALESSFAQNPLPLDEELDRLRSETKMTRREIDS 

WSERRKKVNAEETKKAEENASQEEEEAAEDEG 

GEEDLASELRVSGENGSLEMPSSHILAERKVSPIK 

INLKNLRVTEANGRNEIPGLGACDPEDDESNKLA 

EQLPGKVSCKKTAQQRHLLRQLFVQTQWPSNQD 

YDSIMAQTGLPRPEVVRWFGDSRYALKNGQLK 

WYEDYKRGNFPPGLLVIAPGNRELLQDYYMTHK 

MLYEEDLQNLCDKTQMSSQQVKQWFAEKMGEE 

TRAVADTGSEDQGPGTGELTAVHKGMGDTYSE 

VSENSESWEPRVPEASSEPFDYTSSPQAGRQLETD 

3448 

A 

2 

1324 

FVARAEKGFRTREAHLLQVAGVGTGLQNGASLS 

GLASGVMAQRAFPNPYADYNKSLAEGYFDAAG 

RLTPEFSQRLTNKIRELLQQMERGLKSADPRDGT 

GYTGWAGIAVLYLHLYDVFGDPAYLQLAHGYV 

KQSLNCLTKRSITFLCGDAGPLAVAAVLYHKMN 

NEKQAEDCITRLIHLNKIDPHAPNEMLYGRIGYIY

ALLFVNKNFGVEKIPQSHIQQICETILTSGENLAR 

KRNFTAKSPLMYEWYQEYYVGAAHGLAGIYYY 

LlVl^r oLl^ V oy Lj JvLrioL. V ivr o V D Y V Ov^L»Kr r oOfN 

YPPCIGDNRDLLVHWCHGAPGVIYMLIQAYKVF 

R/EREKYLC\DAYQCADVIWQYGLLKKGYGLCY\ 

GSAGNAYAFLTLYNLTQDMKYLYRACKFAEWC

LEYGEHGCRTPDTPFSLFEGMAGTIYFLVADLLFP 

TKARVFPAFEL 

3449 

A 

3 

2389 

SRHVTGAARSPSRAGPSDPPAMGDEDDDESCAV 

ELRITEANLTGHEEKVSVENFELLKVLGTGAYGK 

VFLVRKAGGHDAGKLYAMKVLRKAALVQRAK 

TQEHTRTERSVLELVRQAPFLVTLHYAFQTDAKL 

HLILDYVSGGEMFTHLYQRQYFKEAEVRVYGGE 

IVLALEHLHKLGIIYRDLKLENVLLDSEGHIVLTD 

FGLSKEFLTEEKERTFSFCGTIEYMAPEIIRSKTGH 

GKAVDWWSLGILLFELLTGASPFTLEGERNTQAE 

VSRRILKCSPPFPPRIGPVAQDLLQRLLCKDPKKR 

LGAGPQGAQEVRNHPFFQGLDWVALAARKIPAP 

FRPQIRSELDVG\NFAEEFTRLEPVYSPPGQ\PPPG 

DPRIFQGYSFVAPSILFDHNNAVMTDGLEAPGAG 

DRPGRAAVARSAMMQDSPFFQQYELDLREPALG 

QGSFSVCRRCRQRQSGQEFAVKILSRRLEANTQR 

EVAALRLCQSHPNVVNLHEVHHDQLHTYLVLEL 

LRGGELLEHIRKKRHFSESEASQILRSLVSAVSFM 

HEEAGWHRDLKPENILYADDTPGAPVKIIDFG/F 

SPRLRPQSPGVPMQTPSFTLQYAAPELLAQQGYD 

QAAEIMCKIREGRFSLDGEAWQGVSEEAKELVR 

GLLTVDPAKRLKLEGLRGSSWLQDGSARSSPPLR 

TPDVLESSGPAVRSGLNATFMAFNRGKREGFFLK 

SVENAPLAKRRKQKLRSATASRRGSPAPANPGR 

APVASKGAPRRANGPLPPS 

3450 

A 

201 

1705 

KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDS 

EDRSDSRAAQPAHDSGHGDDESPSTSSGTAGTSS 

VPELPGFYFDPEKKRYFRLLPGHNNCNPLTKESIR 

QKEMESKRLRLLQEEDRRKKIARMGFNASSMLR 

KSQLGFLNVTNYCHLAHELRLSCMERKKVQIRS 

MDPSALASDRFNLILADTNSDRLFTVNDVTVGGS 

KYGIINLQSLKTPTLKVFMHENLYFTNRKVXNSV 

CWASLNHLDSHILLCLMGLAETPGCATLLPASLF 

VNSHPAGTDRPGXMT P^FRTPriAW^rAW^T NIOA 

NNCFSTGLSRRVLLTNVVTGHRQSFGTNSDVLA 

QQFALMAPLLFNGCRSGEIFADDLRCGNQGKGW 

KATRLFHDSAVTSVRILQDEQYLMASDMAGKIK 

LWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEGI 

LVAVGQDCYTRIWSLHDARLLRTIPSPYPASKAD 

IPSVAFSSRLGGSRGAPGLLMAVGQDLYCYSYS 

3451 

A 

19 

6033 

LLSAMLSHGAGLALWITLSLLQTGLAEPERCNFT 

LAESKASSHSVSIQWRILGSPCNFSLIYSSDTLGA

ALCPTFRIDNTTYGCNLQDLQAGTIYNFKIISLDE

ERTVVLQTDPLPPARFGVSKEKTTSTGLHVWWT 

PSSGKVTSYEVQLFDENNQKIQGVQIQESTSWNE 

YTFFNLTAGSKYNIAITAVSGGKRSFSVYTNGST 

VPSPVKDIGISTKANSLLISWSHGSGNVERYRLM 

LMDKGILVHGGVVDKHATSYAFHGLSPGYLYNL 

TVMTEAAGLQNYRWKLVRTAPMEVSNLKVTND 

GSLTSLKVKWQRPPGVNVDSYNITLSHKGTIKESR 

VLAPWITVETHFKELVPGRLYXQVTCSAVSLGELS 

AQKMVAVGRTFPDKVAM^EANNNGRMRSLVVS 

WSPPAGDWEQYRILLFNDSVVLLNITVGKEETQ 

YVMDGTGLVPGRQYEVEVIVESGNLKNSERCQG 

RTVPLAVLQLRVKHANETSLSIMWQTPVAEWEK 

YIISLADRDLLLIHKSLSKDAKEFTFTDLVPGRKY 

MATVTSISGDLKNSSSVKGRTVPAQVTDLHVAN 

QGMTSSLFTNWTQAQGDVEFYQVLLIHENVVIK 

NESISSETSRYSFHSLKSGSLYSVVVTTVSGGISSR 

QVVVEGRTVPSSVSGVTVNNSGRNDYLSVSWLL 

APGDVDNYEVTLSHDGKVVQSLVIAKSVRECSF 

SSLTPGRLYTVTITTRSGKYENHSFSQERTVPDKV 

QGVSVSNSARSDYLRVSWVHATGDFDHYEVTIK

NKNNFIQTKSIPKSENECVFVQLVPGRLYSVTVT

TKSGQYEANEQGNGRTIPEPVKDLTLRNRSTEDL 

HVTWSGANGDVDQYEIQLLFNDMJKVFPPFHLVN 

TATEYRFTSLTPGRQYKILVLTISGDVQQSAFIEG 

FTVPSAVKNIHISPNGATDSLTVNWTPGGGDVDS 

YTVSAFRHSQKVDSQTIPKHVFEHTFHRLEAGEQ 

YQIMIASVSGSLKNQINVVGRTVPASVQGVIADN 

AYSSYSLIVSWQKAAGVAERYDILLLTENGILLR 

NTSEPATTKQHKFEDLTPGKKYKIQILTVSGGLFS 

KEAQTEGRTVPAAVTDLRITENSTRHLSFRWTAS

EGELSWYNIFLYNPDGNLQERAQVDPLVQSFSFQ

NLLQGRMYKMVIVTHSGELSNESFIFGRTVPASV 

SHLRGS>niNTTDSLWFNWSPASGDFDFYELILYN 

PNGTKKENWKDKDLTEWRFQGLVPGRKYVLW 

VVTHSGDLSNKVTAESRTAPSPPSLMSFADIANT 

SLAITWKGPPDWTDYNDFELQWLPRDALTVFNP 

YNNRKSEGRJVYGLRPGRSYQFNVKTVSGDSWK 

TYSKPIFGSVRTKPDKIQNLHCRPQNSTAIACSWI 

PPDSDFDGYSIECRXMDTQEVEFSRKLEKEKSLL 

NIMMLVPHKRYLVSIKVQSAGMTSEVVEDSTIT 

MIDRPPPPPPHIRVlvJEKDVLISKSSINFTVNCSWFS 

DTNGAVKYFTWVREADGSDELKPEQQHPLPSY 

LEYRHNASIRVYQTNYPASKCAENPNSNSKSFNI 

KLGAEMESLGGKCDPTQQKFCDGPLKPHTAYRI 

SIRAFTQLFDEDLKEFTKPLYSDTFFSLPITTESEP 

LFGAIEGVSAGI FLTGMLVAWAI I TrttOTCV^Hfi 

RERPSARLSIRRDRPLSVHLNLGQKGNRKTSCPIK 
INQFEGHFMKLQADSNYLLSKEYEELKDVGRNQ 
SCDIALLPENRGKNRYNNELPYDATRVKLSNVDD 
DPCSDYINASYIPGNNFRREYIVTQGPLPGTKDDF 
WKMVWEQNVHNIVMVTQCVEKGRVKCDHYW 

PADQDSLYYGDLILQMLSESVLPEWTIilEFKICGE 

EQLDAHRLIRHFHYTVWPDHGVPETTQSLIQFVR 

TVRDYINRSPGAGPTVVHCSAGVGRTGTFIALDR 

ILQQLDSKDSVDIYGAVXHDLRLHRVHMVQTEC 

QYVYLHQCVRDVLRARKLRSEQENPLFPIYENV 

NPEYRRDPVYSRH 

3452 

A 

63 

1073 

FFRSSSDNGSPIRQYE/HSTPAHQGPVMGLEGKS/ 

ARNSQLRIVLVGKTGAGKSATGNSILGRKVFHSG 

TAAKSITKKCEKRSSSWKETELVVVDTPGIFDTE 

VPNAETSKEIIRCILLTSPGPHALLLVVPLGRYTEE 

ErlXATEKILKMFGERARSFMILIFTRKDDLGDTN 

LHDYLREAPEDIQDLMDIFGDRYCALNNKATGA 

EQEAQRAQLLGLIQRVVRENKEGCYTNRMYQR

AEEEIQKQTQAMQELHRVELEREKARIREEYEEK

IRKLEDKVEQEKRKXQMEKKLAEQEAHYAVRQ 

QRARTEVESKDGILELIMTALQIASFILLRLFAED 

3453 

A 

2674 

514 

GPITFLKKXAKMKDMPLRIHVLLGLAITTLVQAV

DKKVDCPRLCTCEIRPWFTPRSIYMEASTVDCND 

LGLLTFPARLPANTQILLLQTNNIAKIEYSTDFPV 

NLTGLDLSQKNLSSVTNINGKKMPQLLSVYLEEN 

KLTELPEKCLSELSNLQELYINHNLLSTISPGAFIG 

Lrl>ILLRLl^NSNRLQMINSKWFDALPNLEILMIG 

ENPIIRJKJDMNFKPLINLRSLVIAGINLTEIPDNAL 

VGLENLESISFYDNRLIKVPHVALQKWNLKPLD 

LNKMWRIRRGDFSNMLHLKS^^ 

SLAVDNLPDLI^IEATNNPl^SYIHPNAFFRLPKL 

ESLMLNSNALSALYHGTIESLPNLKEISIHSNPIRC 

DCVIRWMNMNKTNIRFMEPDSLFCVDPPEFQGQ 

NVRQVHFRDMMEICLPLIAPESFPSNLNVEAGSY

VSFHCRATAVEPQPEIYWITPSGQKLLPN'RLTDKF 

YVHSEGTLDINGVTPKEGGLYTCIATNLVGADLK 

SVMIKVDGSFPQDhR^GSLNIKJRDIQANSVLVSW 

KASSKILKSSVKWTAFVKTENSHAAQSARIPSDV 

KVYNLTHLNPSTEYKJCIDIPTIYQKNRKKCVNYT 

TKGLHPDQKEYEKNNTTTLMACLGGLLGIIGVIC 

LISCLSPEMNCDGGHSYVRNYLQKPTFALGELYP 

PLINLWEAGKEKSTSLKVKATVIGLPTNMS 

3454 

A 

1844 

244 

ERYLFATYVAPSATLDIGLQQEKJ<XEIYMKIQPP 

FEDLFDTAEEYILLLLLEPWTKMVKSDQIAYKKV

ELVEETRQLDSTYFRKLQALHKETFSKKAEDTTC 

EIGTGILSLSNVSKRTEYWDNVPAEYKHFKFSDL 

LNNKLEFEHFRQFLETHSSSMDLMCWTDIEQFRR 

ITYRDRNQRKAKSIYIKNKYLNKKYFFGPNSPAS 

LYQQNQVMHLSGGWGKILHEQLDAPVLVEIQK 

HVQNRLENVWLPLFLASEQFAARQKIKVQMKDI 

AEELLLQKAEKKIGVWKPVESKWISSSCKIIAFRK 

ALLNPVTSRQFQRFVALKGDLLENGLLFWQEVQ 

KYKDLCHSHCDESVIQKJKITTIIInCFINS 

DIPVEQAQKJIEHRKELGPYVFREAQMTFLGVMF 

IsJr W r\£F v^Jtir rvJSJNlv 1 LIJtilN 11VJ.O y l^iiJvlvvii 1 IN J\\^JSJv 

KLAVL/QNDEKSGKDGIKQYANTSVPAIKTALLS 

DSFLGLQPYGRQPTWCYSKYIEALEQERILLKIQE 

ELEK\SCLQACNLSQILRLALQLCL 

3455 

A 

228 

3330 

APTAQAMMSFGGADALLGAPFAPLHGGGSLHY 
ALARKGGAGGTRSAAGSSSGFHSWTRTSVSSVS 

ASPSRPRGAGAASSTDSLDTLSNGPEGCMVAVA 

TSRSEKEQLQALNDRFAGYIDKVRQLEAHNRSLE 

GEAAALRQQQAGRSAMGELYEREVREMRGAVL 

RLGAARGQLRLEQEHLLEDIAHVRQRLDDEARQ 

REEAEAAARALARFAQEAEAARVDLQKKAQAL 

QEECGYLRRHHQEEVGELLGQIQGSGAAQAQM 

QAETRDALKCDVTSALREIRAQLEGHAVQSTLQ 

SEEWFRVRLDRLSEAAKVNTDAMRSAQEEITEY 

RRQLQARTTELEALKSTKDSLERQRSELEDRHQA 

DIASYQEAIQQLDAELRNTKWEMAAQLREYQDL 

LNVKMALDIEIAAYRKLLEGEECRIGFGPIPFSLP

EGLPKIPSVSTHIKVKSEEKIKVVEKSEKETVIVEE 

QTEETQVTEEVTEEEDKEAKEEEGKEEEGGEEEE 

AEGGEEETKSPPAEEAASPEKEAKSPVKEEAKSP 

AEAKSPEKEEAKSPAEVKSPEKAKSPAKEEAKSP 

PEVAKSPEKDGKQNFQAEVKSPEKAKSPAKEEAK 

SPAEAKSPEKAKSPVKEEAKSPAEAKSPVKEEAK 

SPAEVKSPEKAKSPTKEEVAKSPEKAKSPEKAKSP 

EKEEAKSPEKAKSPVKAEAKSPEKAKSPVKAEA 

KSPEKAKSPVKEEAKSPEKAKSPVKEEAKSPEKA 

KSPVKEEAKTPEKAKSPVKEEAKSPEKAKSPEKA 

KTLDVKSPEAKTPAKEEARSPADKFPEKAKSPVK 

EEVKSPEKAKSPLKEDAKAPEKEIPKKEEVKSPV 

KEEEKPQEVKVKEPPKKAEEEKAPATPKTEEKK 

DSKKEEAPKKEAPKPKVEEKKEPAVEKPKESKV 

EAKKEEAEDKXKVPTPEKEAPAKVEVKEDAKPK 

EKTEVAKKEPDDAKAKEPSKPAEKKEAAPEKKD 

TKEEKAKKPEEKPKTEAKAKEDDKTLSKEPSKP 

KAEKAEKSSSTDQKDSKPPEKATEDKAAKGK 

3456 

A 

258 

1463 

YLSFIPGHASKSAPMNGHCFAENGPSQKSSLPPLL 

IPPSENLGPHEEDQVVCGFKKLTVNGVCASTPPL 

TPIKNSPSLFPCAPLCERGSRPLPPLPISEALSLDDT 

DCEVEFLTSSDTDFLLEDSTLSDFKYDVPG\RRSF 

RGCGQINYAYFDTPAVSAADLSYVSDQNG\GVP 

DPNPPPPQTHRRLRRSHSGPAGSFNKPAIRISNCCI 

HRASPNSDEDKPEVPPRVPIPPRPVKPDYRRWSA 

EVTSSTYSDEDRPPKVPPREPLSPSNSRTPSPKSLP 

SYLNGVMPPTQSFAPDPKYVSSKALQRQNSEGS 

ASKVPCDLPIIENGKKVSSTHYYLLPERPPYLDKY 

EKFFREAKKKNGGAQIQPLPADCGISSATEKPDS 

KTKMDLGGHVKRKHLSYVGTP 

3457 

A 

2 

4869 

FILSSSSSASSEHFHHHYSFGNWWPGSFKGHRMS 

LPFYQRCHQHYDLSYRNKDVRSTVSHYQREKKR 

SAVYTQGSTAYSSRSSAAHRRESEAFRRASASSS 

QQQASQHALSSEVSRKAASAYDYGSSHGLTDSS 

LLLDDYSSKLSPKPKRAKHSLLSGEEKENLPSDY 

MVPIFSGRQKHVSGITDTEEERIKEAAAYIAQRNL 

LASEEGITTPKQSTASKQTTASKQSTASKQSTASK 

QSTASRQSTASRQSVVSKQATSALQQEETSEKKS 

KKV VUvOKAERLSLKKl Lbb 1 b 1 YHAKLNbDrlLL 

HAPEFIIKPRSHTVWEKENVKLHCSIAGWPEPRV 

TWYKNQVPINVHANPGKYIIESRYGMHTLEINAC 

DFEDTAQYRASAMNVKGELSAYASVVVKRYKG 

EFDETRFHAGASTMPLSFGVTPYGYASRFEIHFD

DKFDVSFGREGETMSLGCRVVITPEIKHFQPEIQ

WYRNGVPLSPSKWVQTLWSGERATLTFSHLNKE 

DEGLYTIRVRMGEYYEQYSAYVFVRDADAEIEG 

APAAPLDVKCLEANKDYIIISWKQPAVDGGSPIL 

GYFIDKCEVGTDSWSQCNDTPVKFARFPVTGLIE 

GRSYIFRVRAVNKMGIGFPSRVSEPVAALDPAEK 

ARLKS/PPLSTLDWTWIVTEEEPSEGIVPGPPTDLS 

VTEATRSYVVLSWKPPGQRGHEGIMYFVEKCEA 

GTENWQRVNTELPVKSPRFALFDLAEGKSYCFR 

VRCSNSAGVGEPSEATEVTVVGDKLDIPKAPGKI 

IPSRNTDTSVVVSWEESKDAKELVGYYIEANVA 

GSGKWEPCNNNPVKTHRFTCHGLVTGQSYIFRV 

RAVNAAGLSEYSQDSEAIEVKAAIAPPSPPCDITC

LESFRDSMVLGWKQPDKIGGAEITGYYVNYREV 

IDGVPGKWREANVKAVSEEAYKISNLKENMVY 

QFQVAAMNMAGLGAPSAVSECFKCEEWTIAVP 

GPPHSLKCSEVRKDSLVLQWKPPVHSGRTPVTG 

YFVDLKEAKAKEDQWRGLNEAAIKNVYLKVRG 

LKEGVSYVFRVRAINQAGVGKPSDLAGPVVAET 

RPGTKEVVVNVDDDGVISLNFECDKMTPKSEFS 

WSKDYVSTEDSPRLEVESKGNKTKMTFKDLGM 

DDLGIYSCDVTDTDGIASSYLIDEEELICRLLALSH 

EHKFPTVPVKSELAVEILEKGQVRFXWMQAEKLS 

GNAKYNYCFNEKGIFEGPKYKMHIDRNTGIIEMF 

MEKLQDEDEGTYTFQLQDGKATNHSTVVLVGD 

VFKKLQKEAEFQRQEWIRKQGPHFVEYLSWEVT 

GECNVLLKCKVANIKKETHIVWYKDEREISVDE 

KHDFKDGICTLLITEFSKKDAGIYEVILKDDRGK 

DKSRLKLVDEAFKELMMEVCKKIALSATDLKIQ 

STAEGIQLYSFVTYYVEDLKVNWSHNGSAIRYSD 

RVKTGVTGEQIWLQINEPTPNDKGKYVMELFDG 

KTGHQKTVDLSGQAYDEAYAEFQRLKQAAIAEK 

NRARVLGGLPDVVTIQEGKALNLTCNVWGDPPP 

EVSWLKNEKALASDDHCNLKFEAGRTAYFTING 

VSTADSGKYGLWKNKYGSETSDFTVSVFIPEEE 

ARMAALESLKGGKKAK 

3458 

A

3963 

827 

LSRSSSDNNTNTLGRNVMSTATSPLMGAQSFPNL 

TTPGTTSTVTMSTSSVTSSSNVATATTVLSVGQS 

LSNTLTTSLTSTSSESDTGQEAEYSLYDFLDSCRA 

STLLAELDDDEDLPEPDEEDDENEDDNQEDQEY 

EEVMBLRRPSLQRRAGSRSDVTHHAVTSQLPQVP 

AGAGSRPIGEQEEEEYETKGGRRRTWDDDYVLK 

RQFSALVPAFDPRPGRTNVQQTTDLEIPPPGTPHS 

ELLEEVECTPSPRLALTLKVTGLGTTREVELPLTN 

FRSTIFYYYQKLLQLSCNGNVKSDKLRRJWEPTY 

TIMYREMKDSDKEKENGKMGCWSIEHVEQYLG 

TDELPKNDLITYLQKNADAAFLRHWKLTGTNKS 

IRKNRNCSQLIAAYWDLG\EHGTK\SGLNQGAIST 

LQSSDILNLTKEQPQAKAGNGQNSCGVEDVLQL 

LRILYIVASDPYSRISQEDGDEQPQFTFPPDEFTS/ 

KKITTKILQQIEEPLALASGALPDWCEQLTSKCPF 

LIPFETRQLYFTCTAFGASRAIVWLQNRREATVE 

RTRTTSSVRRDDPGEFRVGRLKHERVKVPRGESL 

MEWAE^MQIHADRKSVLEVEFLGEEGTGLGPT 

LEFYALVAAEFQRTDLGAWLCDDNFPDDESRHV 

DLGGGLKPPGYYVQRSCGLFTAPFPQDSDELERI 

TKLFHFLGIFLAKCIQDNRLVDLPISKPFFKLMCM

GDIKSNMSKLIYESRGDRDLHCTESQSEASTEEG 

HDSLSVGSFEEDSKSEFELDPPKPKPPAWFNGILT 

WEDFELVNPHRARPLKEIKDLAIKRRQILSNKGL 

SEDEKNTKLQELVLKNPSGSGPPLSIEDLGLNFQF 

CPSSRIYGFTAVDLKPSGEDEMITMDNAEEYVDL

MFDFCMHTGlQKQMEAFRDGFNKVFPMEKLSSF 

SHEEVQMILCGNQSPSWAAEDIINYTEPKLGYTR 

DSPGFLRFVRVLCGMSSDERJKAFLQFTTGCSTLP 

PGGLANLHPRLTWRKVDATDASYPSVNTCVHY 

LKLPEYSSEEIMRERLLAATMEKGFHLN 

3459 

A 

88 

603 

SCGPRGLASLGLGFSGRCDDQNKGRSVDGPEAQA 

EACSGERTYQELLVNQNPIAQPLASRRLTRKLYK 

CIKKAVKQKQIRRGVKEVQKFVNKGEKGIMVLA 

GDTLPIEVYCHLPVMCEDRNLPYVYIPSKTDLGA 

AAGSKRFTCVIMVKPHEEYQEAYDECLEEVQSL 

PLPL 

3460 

A 

139 

1997 

QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDVVAPKPPIEPEEEK 

TLKKDEEN\DSKAPPHELTEEEKQQILHSEEFLSFF

DHSTRIVERALSEQINIFFDYSGRDF/ENDfCEGEIQ 

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLVVGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

NVVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

GSVYTACRHGSKAGISEMFEGHQGPITGIHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTKNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA 

3461 

A 

139 

1997 

QVTNMSDKSELKAELERKKQRLAQIREEKKRKE 

EERKKKETDQKKEAVAPVQEESDLEKKRREAEA 

LLQSMGLTPESPIVPPPMSPSSKSVSTPSEAGSQD 

SGDGAVGSRRGPIKLGMAKITQVDFPPREIVTYT 

KETQTPVMAQPKEDEEEDDDWAPKPPIEPEEEK 

TLKKDEE^ADSKAPPHELTEEEKQQILHSEEFLSFF 

DHSTRIVERALSEQINIFFDYSGRDF/ENDKEGEIQ

AGAKLSLNRQFF\DER\WSKASGWVSCLDWSSQ 

YP\ELLVASYNNNEDAPHEPDGVALVWNMKYK 

KTTPEYVFHCQSAVMSATFAKFHPNLWGGTYS 

GQIVLWDNRSNKRTPVQRTPLSAAAHTHPVYCV 

mrVGTQNAHNLISISTDGKICSWSLDMLSHPQDS 

MELVHKQSKAVAVTSMSFPVGDVNNFVVGSEE 

ObV Y I ACKJlUbKJ\ulobMrbOriv^CjPl 1 ulHCHAA 

VGAVDFSHLYVTSSFDWTVKLWTTXNNKPLYSF 

EDNAGYVYDVMWSPTHPALFACVDGMGRLDL 

WNLNNDTEVPTASISVEGNPALNRVRWTHSGRE 

IAVGDSEGQIVIYDVGEQIAVPRNDEWARFGRTL 

AEINANRADAEEEAATRIPA

3462 

A 

2 

2643 

TAPEFSRSTHASAHASVARVLRNREIAQLKKEQR 

RQEFQIRALESQKRQQEMVLRRKTQEVSALRRL 

AKPMSERVAGRAGLKPPMLDSGAEVSASTTSSE 

AESGARSVSSIVRQWNRKINHFLGDHPAPTVNGT 

RPARKKFQKKGASQSFSKAARLKWQSLERRIIDI 

VMQRMTIVNLEADMERLIKKREELFLLQEALRR 

KRERLQAESPEEEKGLQELAEEIEVLAANIDYIND 

GITDCQATIVQLEETKEELDSTDTSVVISSCSLAE 

ARLLLDNFLKASIDKGLQVAQKEAQIRLLEGRLR 

QTDMAGSSQNHLLLDALREKAEAHPELQALIYN 

VQQENGYASTDEEISEFSEGSFSQSFTMKGSTSH 

DDFKFKSEPKLSAQMKAVSAECLGPPLDISTKNI 

TKSLASLVEIKEDGVGFSVRDPYYRDRVSRTVSL 

PTRGSTFPRQSRATETSPLTRRKSYDRGQPIRSTD 

VGFTPPSSPPTRPRNDRNVFSRLTSNQSQGSALD 

KSDDSDSSUSEVLRGIISPVGGAKGARTAPLQCV 

SMAEGHTKPILCLDATDELLFTGSKDRSCKMWN 

LVTGQEIAALKGHPNNVVSIKYCSHSGLVFSVST 

SYIKVWDIRDSAKCIRTLTSSGQVISGDACAATST 

RAITSAQGEHQINQIALSPSGTMLYAASGNAVRI 

WELSRFQPVGK1.TGHIGPVmCLTVTQT^ 

VVTGSKDHYVKMFELGECVTGTIGPTHNFEPPH

YDGIECLAIQGDILFSGSRDNGIKXWDLDQQELIQ 

QIPNAHKDWVCALAFIPGRPMLLSACRAGVIKV

WNVDNFTPIGEIKGHDSPINAICTNAKHIFTASSG

CRVKVWNYVPGLTPCLPRRVLAIKGRATTLP 

3463 

A 

198 

3146 

SGEPRPEPGNMATCIGEKJEDFKVGNLLGKGSFA 

GVYRAESIHTGLEVAIKMIDKKAMYKAGMVQR 

VQNEVKIHCQLKHPSILELYNYFEDSNYVYLVLE 

MCHNGEMNRYLKNRVKPFSENEARHFMHQIITG 

MLYLHSHGILHRDLTLSNLLLTRNMNIKIADFGL 

ATQLKMPHEKHYTLCGTPNYISPEIATRSAHGLE 

SDVWSLGCMFYTLLIGRPPFDTDTVKNTLNKVV 

LADYEMPTFLSIEAKDLIHQLLRRNPADRLSLSSV 

LDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAI 

TASSSTSISGSLFDKRRLLIGQPLPNKMTVFPKNK 

SSTDFSSSGDGNSFYTQWGNQETSNSGRGRVIQD 

AEERPHSRYLRRAYSSDRSGTSNSQSQAKTYTM 

ERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 

NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFP 

FADPTPQTETVQQWFGNLQINAHLRKTTEYDSIS 

PNRDFQGHPDLQKDTSKNAWTDTKVKKNSDAS 

DNAHSVKQQNTMKYMTALHSKPEIIQQECVFGS 

DPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHR

LKPIRQKTKKAWSILDSEEVCVELVKEYASQEY 

VKEVLQISSDGNHTIYYPNGG\RGFPLA\DRPPSP 

TVDNISR\YSF\DNLPEKYAVRKYQYASRFVQLVRS 

KSPKITYFTRYAKCILMENSPGADFEVWFYDGV 

KIHKTEDFIQVIEKTGKSYTLKSESEVNSLKEEIK 

MYTV/mi4A"MFnT-lPIPT AT PQnWPPPTfTPQAPFPPTT 
ivi i ivij^iT/YiNrivjriiui^j^/^ 1 Jtvo/vrrrr^li 

IGRKPGSTSSPKALSPPPSVDSNYPTRDRASFNRM 

VMHSAASPTQAPBLNPSMVTNEGLGLTTTASGTD 

ISSNSLKDCLPKSAQLLKSVFVKNVGWATQ\LTS 

GAVWVQFNDGSQLVVQAGVSSISYTSPNGQ\TTR 

\YGENEKLPDYIKQKLQCLSSELLMFSNPTPNFH 

3464 

A 

14 

348 

AVRTVSGTSLGPRSHSRSPGRCHCFSAVTFSSPRL 
AASEAPDPMEEWDVPQMKKEVESLKYQLAFQR 
EMASKTIPELLKWIEDGIPKDPFLNPDLMKNNPW 
VXEKGKCTIL 

3465 

A 

5537 

405 

VRKLDRERVGAWWRGAWARHPRQEAGEHAKR 

RKGHAETPRGRRKGRAGRSAAAVGELRPARRSL 

ETSRAAAAMAKDSPSPLGASPKKPGCSSPAAAV 

LENQRRELEKLRAELEAERAGWRAERRRFAARE 

RQLREEAERERRQLADRLRSKWEAQRSRELRQL 

QEEMQREREAEIRQLLRWKEAEQRQLQQLLHRE 

RDGVVRQARELQRQLAEELVNRGHCSRPGASEV 

SAAQCRCRLQEVLAQLRWQTDGEQAARIRYLQ 

AALEVERQLFLKYILAHFRGHPALSGSPDPQAVH 

SLEEPLPQTSSGSCHAPKPACQLGSLDSLSAEVG 

VRSRSLGLVSSACSSSPDGLLSTHASSLDCFAPAC 

SRSLDSTRSLPKASKSEERPSSPDTSTPGSRRLSPP 

PSPLPPPPPPSAHRKLSNPRGGEGSESQPCEVLTPS 

PPGLGHHELIKLNWLLAKALWVLARRCYTLQEE 

NKQLRRAGCPYQADEKVKRLKVKRAELTGLAR 

RLADRARELQETNLRAVSAPIPGESCAGLELCQV 

FARQRARDLSEQASAPLAKDKQIEELRQECHLLQ 

ARVASGPCSDLHTGRGGPCTQWLNVRDLDRLQ 

RESQREVLRLQRQLMLQQGNGGAWPEAGGQSA 

TCEEVRRQMLALERELDQRRRECQELGAQAAPA 

RRRGEEAETQLQAALLKNAWLAEENGRLQAKT 

DWVRKVEAENSEVRGHLGRACQERDASGLIAEQ 

LLQQAARGQDRQQQLQRDPQKALCDLHPSWKEI 

QALQCRPGHPPEQPWETSQMPESQVKGSRRPKF 

HARAEDYAVSQPNRDIQEKREASLEESPVALGES 

ASVPQVSETVPASQPLSKKTSSQSNSSSEGSMWA 

TVPSSPTLDRDTASEVDDLEPDSVSLALEMGGSA 

APAAPKLKIFMAQYNYNPFEGPNDHPEGELPLTA 

GDYIYIFGDMDEDGFYEGELEDGRRGLVPSNFVE 

QIPDSYIPGCLPAKSPDLGPSQLPAGQDEALEEDS 

LLSGKAQGVVDRGLCQMVRVGSKTEVATEELDT 

KTEACQLGLLQSMGKQGLSRPLLGTKGVLRMAP 

MQLHLQNVTATSANITWVYSSHRHPHVVYLDD 

REHALTPAGVSCYTFQGLCPGTHYRARVEVRLP 

RDLLQVYWGTMSSTVTFDTLLAGPPYPPLDVLV 

ERHASPGVLVVSWLPVTIDSAGSSNGVQVTGYA 

VYADGLKVCEVADATAGSTLLEFSQLQVPLTWQ 

KVSVRTMSLCGESLDSVPAQIPEDFFMCHRWPET 

PPFSYTCGDPSTYRVTFPVCPQKLSLAPPSAKASP 

HNPGSCGEPQAKFLEAFFEEPPRRQSPVSNLGSE 

GECPSSGAGSQAQELAEAWEGCRKDLLFQKSPQ 

NHRPPSVSDQTGEKENCYQHMGTSKSPAPGFIHL 

RTECGPRJCEPCQEKAALERVLRQKQDAQGFTPP 

QLGASQQYASDFHNVLKEEQEALCLDLWGTERR 

EERREPEPHSRQGQALGVKRGCQLHEPSSALCPA 

PSAKVIKMPRGGPQQLGTGANTPARVFVALSDY

NPLVMSANLKAAEEELVFQKRQLLRVWGSQDT 

HDFYLSECNRQVGNIPGRLVAEMEVGTEQTDRR 

WRSPAQGHLPSVAHLEDFQGLTEPQGSSLVLQGN 

SKRLPLWTPKIMIAALDYDPGDGQMGGQGKGRL 

ALRAGDVVMVY\GPMDDQGFYYGELGGHRG\L 

VPANLRIKMSSQGH 

3466 

A 

1 

1111 

MSKPPDLLLRLLRGAPRQRVCTLFIIGFKFTFFVSI 

MIYWHVVGEPKEKGQLYNLPAEIPCPTLTPPTPP 

SHGPTPGNIFFLETSDRTNPNFLFMCSVESAARTH 

PESHVLVLMKGLPGGNASLPRHLGISLLSCFPNV 

QMLPLDLRELFRDTPLADWYAAVQGRWEPYLL 

PVLSDASRIALMWKFGGIYLDTDFIVLKNLRNLT 

NVLGTQSRYVLNGAFLAFERRHEFMALCMRDFV 

DHYNGWTWGHQGPQLLTRVFKKWCSIRSLAESR 

ACRGVTTLPPEAFYPIPWQDWKKYFEDINPEELP 

RLLSATYAVHVWNKKSQGTRFEATSRALLAQLH 

ARYCPTTHE/DHENVLVKGPAGHLPNLLLMGHW 

3467 

A 

1 

2175 

MAKVILKQSKQCKNLLTCKVAQVCPVCGCLHC 

YFWWLSGLESRRPSSPLIDIKPIEFGVLSAKKEPIQ

PSVLRRTYNPDDYFRKFEPHLYSLDSNSDDVDSL 

TDEEILSKYQLGMLHFSTQYDLLHNHLTVRVIEA 

RDLPPPISHDGSRQDMAHSNPYVKICLLPDQKNS 

KQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLLL 

TVVDFDKFSRHCVIGKVSVPLCEVDLVKGGHW 

WKAHDSQFSAPGLPADQQFFADLFSGLVLNPQL 

LGRVWFASQPASLPVGSLCIDFPRLDIVLRGEYG 

NLLEAKQQRLVEGEMLFIPARAANLPVNNKPVM 

LLSLVFAPTWLGLSFYDSRTTSLLHPARQIQLP\SL 

QRGEGEAMLS\ALTLFSRSPLEQNIIQPLVLSLLHL

CGSVVNMPPGNSQPRGDFLYHSICTWVQDNYAQ 

PLTRESVAQFFNITPNHLSKLFAQHGTMRFIEYVR 

WVRMAKARMILQKYHLSIHEVAQRCGFPDSDYF 

CRVFRRQFGMDYVDILQIHRWDYNTPIEETLEAL 

NDWKAGKARYIGASSMHASQFAQALELQKQH 

GWAQFVSMQDHYNLIYREEEREMLPLCYQEGV 

AVIPWSPLARGRLTRPWGETTARLVSDEVGKNL 

YKESDENDAQIAERLTGVSEELGATRAQVALAW 

LLSKPGIAAPIIGTSREEQLDELLNAVDITLKPEQI 

AELETPYKPHPVVGFK 

3468 

A 

147 

3209 

ALPLPLPTLYPGMSRRKQRKPQQLISDCEGPSASE 

NGDASEEDHPQVCAKCCAQFTDPTEFLAHQNAC 

STDPPVMVIIGGQENPNNSSASSEPRPEGrlNNPQ 

VMDTEHSNPPDSGSSVPTDPTWGPERRGEESSGH 

FLVAATGTAAGGGGGLILASPKLGATPLPPESTP 

APPPPPPPPPPPGVGSGHLNIPLILEELRVLQQRQI 

HQMQMTEQICRQVLLLGSLGQTVGAPASPSELP 

GTGTASSTKPLLPLFSPIKPVQTSKTLASSSSSSSS 

SSGAETPKQAFFHLYHPLGSQHPFSAGGVGRSHK

PTPAPSPALPGSTDQLIASPHLAFPSTTGLLAAQC 

LGAARGLEATASPGLLKPKNGSGELSYGEVMGP 

LEKPGGRHKCRFCAKVFGSDSALQIHLRSHTGER 

PYKCNVCGNRFTTRGNLKVHFHRHREKYPHVQ 

MNPHPVPEHLDYVITSSGLPYGMSVPPEKAEEEA 

ATPGGGVERKPLVASTTALSATESLTLLSTSAGT 

ATAPGLPAJ^TsTKFVLMKAVEPKNKADENTPPGSE 

GSAISGVAESSTATRMQLSKLVTSLPSWALLTNH 

FKSTGSFPLPLCARALGVASPSETSKLQQLVEKID 

RQGAVAVTSAASGAPTTSAPAPSSSASSGPNQCV 

ICLRVLSCPRALRLHYGQHGGERPFKCKVCGRAF 

STRGNLRAHFVGHKASPAARAQNSCPICQKKFT 

NAVTLQQHVRMHLGGQIPNGGTALPEGGGAAQ 
ENGSEQSTVSGAGSFPQQQSQQPSPEEELSEEEEE 
EDEEEEEDVTDEDSLAGRGSESGGEKAISVRGDS 
EEASGAEEEVGTVAAAATAGKEMDSNEKTTQQS

Of t~> r") t> o T~> n r"\ c f fa /^v "i ■> /~\ r» i\ /f r? /"> C c 1 /~* "w t xy t -1 i~* /"« is x~\ 

aLFrrrrFuSLDQryPIVL^^ 

ERSSSPASALTPEGEATSVTLVEELSLQEAMRKEP 

GESSSRKACEVCGQAFPSQAAL\EEH\QKTHPKEG 

PLF\TCVFCRQGFLERATLKKHMLLAHHQVQPFA 

PHGPQNIAALSLVPGCSPSITSTGLSPFPRKDDPTI 

P 

3469 

A 

3 

5664 

NLRPLSFALFLGDPNMANLEESFPRGGTRKIHKP 

EKAFQQSVEQDNLFDISTEEGSTKRKKSQKGPAK 

TKKLKIEKRESSKSAREKFEILSVESLCEGMRILG 

CVKEVNELELVISLPNGLQGFVQVTEICDAYTKK 

LNEQVTQEQPLKDLLHLPELFSPGMLVRCVVSSL 

GITDRGKKSVKLSLNPKNVNRVLSAEALKPGML 

LTGTVSSLEDHGYLVDIGVDGTRAFLPLLKAQEY 

IRQKNKGAKLKVGQYLNCIVEKVKGNGGVVSLS 

VGHSEVSTAIATEQQSWNLNNLLPGLVVKAQVQ 

KVTPFGLTLNFLTFFTGVVDFMHLDPKKAGTYFS 

NQAVRACILCVHPRTRVVHLSLRPIFLQPGRPLTR 

LSCQNLGAVLDDVPVQGFFKKAGATFRLKDGVL 

AYARLSHLSDSKNVFNPEAFKPGNTHKCRIIDYS 

QMDELALLSLRTSIIEAQYLRYHDIEPGAVVKGT 

VLTIKSYGMLVKVGEQMRGLVPPMHLADILMK 

NPEKXYHIGDEVKCRVLLCDPEAKKLMMTLKKT 

HESKLPVITCYADAKPGLQTHGFIIRVKDYGCIV \ 

KFYKNTVQGLVPKHELSTEYIPDPERVFYTGQVV 1 

KWVLNCEPSKERMLLSFKLSSDPEPKKEPAGHS 

QKKGKAINIGQLVDVKVLEKTKDGLEVAVLPHN 

IRAFLPTSHLSDHVANGPLLHHWLQAGDILHRVL 

CLSQSEGRVLLCRKPALVSTVEGGQDPKNFSEIH 

PGMLLIGFVKSIKDYGVFIQLPSGLSGLAPKAIMS 

DKFVTSTSDHFVEGQTVAAKVTNVDEEKQRMLL 

SLRLSDCGLGDLAITSLLLLNQCLEELQGVRSLM

SNRDSVLIQTLAEMTPGMFLDLVVQEVLEDGSV 

VFSGGPVPDLVLKASRYHRAGQEVESGQKKKVV 

ILNVDLLKLEVHVSLHQVDLVXNRKARXLRKGSE

HQAIVQHLEKSFAIASLVETGHLAAFSLTSHLND 

TFRFDSEKLQVGQGVSLTLKTTEPGVTGLLLAVE 

GPAAKRTMRPTQKDSETVDEDEEVDPALTVGTI 

KKHTLSIGDMVTGTVKSIKPTHVVVTLEDGIIGCI 

HASHILDDVPEGTSPTTKLKVGKTVTARVIGGRD 

MKTFKYLPISHPRFVRTIPELSVRPSELEDGHTAL 

NTHSVSPMEKIKQYQAGQTVTCFLKKYNVVKK 

WLEX^IAPDIRGRIPLLLTSLSFKVLKHPDKKFRV 

GQALRATVVGPDSSKTFLCLSLTGPHKLEEGEVA 

MGRVVKVTPNEGLTVSFPFGKIGTVSIFHMSDSY 

SETPLEDFVPQKVVRCYILSTADNVLTLSLRSSRT 

MPRT , i^<Jir\/'pnpPPU"CTor»rR r i7r:r^r t d/^ v\/r:Q. \r\vu 

iNfril 1 JVolv v Xj/Ui GlINolV^lJllSJlOv^JLLlvvj I VuM^x Jri 

GVFFRLGPSVVGLARYSHVSQHSPSKKALYNKH 

LPEGKLLTARVLRLNHQKNLVELSFLPGDTGKPD 

VLSASLEGQLTKQEERKTEAEERDQKGEKKNQK 

RNEKKNQKGQEEVEMPSKEKQQPQKPQAQKRG 

GRECRESGSEQERVSKKPKKAGLSEEDDSLVDV 

YYREGKEEAEETNVLPKEKQTKPAEAPRLQLSSG 

FAWNVGLDSLTPALPPLAESSDSEEDEKPHQATI

KKSKKERELEKQKAEKELSRTEEALMDPGRQPE 

SADDFDRLVLSSPNSSILWLQYMAFHLQATEIEK 

ARAVAERALKTISFREEQEKLNVWVALLNLENM 

YGSQESLTKVFERAVQYNEPLKVFLHLADIYAKS 

bisJr (^b AGbL Y N KMLKJKr RQ EKA V W IK Y G AFLLR 

RSQAAASHRVLQRALECLPSKEHVDVIAKFAQL 

EFQLGDAERAKAIFENTLSTYPKRTDVWSVYID 

MTIKHGSQKDVRDIFERVIHLSLAPKRMKFFFKR

YLDYEKQHGTEKDVQAVKAKALEYVEAKSSVL 

ED 

3470 

A 

2334 

1226 

TAAAPVAPGTMDDATVLRKKGYIVGINLGKGSY

AKVKSAYSERLKFNVAVKIIARKKTPTDFVERFL 

PREMDILATVNHGSIIKTYEIFETSDGRIYIIMELG

VQGDLLEFIKCQGALHEDVARKMFRQLSSAVKY 

CHDLDIVHRDLKCENLLLDKDFNIKLSDFGFSKR 

CLKDbNGKllLbK 1 FCGSA A Y A APE VLQSIPYQPK 

VYDIWSLGVILYIMVCGSMPYDDSDIRKMLRIQK

EHRVDFPRSKNLTCECKDLIYRMLQ\PDVS\KRLH 

IDEILSHSWLQPPKPKWTSSASFKREGEGKYRAE 

/~l \S T rVT i l/ r TV"* I O rkI~\T TT> TITVI IT./' T /"* A 1/ T/^I TT» 1 T I fl fn^T 

CKLDTKTGLRPDHRPDHKLGAKTQHRLLVVPEN 
ENRMEDRLAETSRAKDHHISGAEVGKAST 

3471 

A 

537 

148 

TERGAPQHPTLPLPSLTPSSVHTGQPKTTPSVILFL 
PSCEEPQANKATLVCLMNN/FYPGILMVTWKAD 
GTLITQSVEKTTPSKQSNNKYVASSYLSLTPEQW 
RSRRSYSCQVMQEGSTVEKSVAPAECS 

3472 

A 

1 

2272 

DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHWFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNVVFGLGGELFLWDGEDSSFLVVRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGKNSEFEGGKST 

WCSTTPVAERFFTSSTSLTLFCHAAWYPSEILDPH 

VVLLTSDNVIRIYSLREPQTPTNVULSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHASVAAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCVVLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC 

PVKLHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHDLCTKPLP 

CRQPAPFRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSCLQRSVANPAFLKASEKDIAPPPEECLQLLS 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 
QEDIMNRMKKLLHSFHSELPVLSDSERDMKKEL 
QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 
SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

JL/JJVlNxl v in r 

3473 

A 

1 

2272 

DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQT 

WLPNHVVFLRLREGLKNQSPTEAEKPASSSLPSS 

PPPQLLTRNVVFGLGGELFLWDGEDSSFLWRLR 

GPSGGGEEPALSQYQRLLCINPPLFEIYQVLLSPT 

QHHVALIGIKGLMVLELPKRWGKNSEFEGGKST 

VNCSTTPVAERFFTSSTSLTLKHAAWYPSEILDPH 

VVLLTSDNVIRIYSLREPQTPTNVIILSEAEEESLV 

LNKGRAYTASLGETAVAFDFGPLAAVPKTLFGQ 

NGKDEVVAYPLYILYENGETFLTYISLLHSPGN/I 

WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPN 

ILVIATESGMLYHCVVLEGEEEDDHTSEKSWDSR 

IDLIPSLYVFECVELELALKLASGEDDPFDSDFSC

PVKXHRDPKCPSRYHCTHEAGVHSVGLTWIHKL 

HKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLP 

CRQPAPIRGFWIVPDILGPTMICITSTYECLIWPLL 

STVHPASPPLLCTREDVEVAESPLRVLAETPDSFE 

KHIRSILQRSVANPAFLKASEKDIAPPPEECLQLLS

KA I vjvrKJi^YlLKl^DLAKbblQr^ 

KQLEDLSYCREERKSLREMAERLADKYEEAKEK 

QEDIMNTRMKKLLHSFHSELPVLSDSERDMKXEL 

QLIPDQLRHLGNAIKQVTMKKDYQQQKMEKVL 

SLPKPTIILSAYQRKCIQSILKEEGEHIREMVKQIN 

DIRNHVNF 

3474 

A 

4344 

2550 

DRRREPERHVRVKQRTSVLNMLRRLDKIRFRGH 

KRDDFLDLAESPNASDTECSDEIPLKVPRTSPRDS 

EELRDPAGPGTLIMATGVQDFNRTEFDRLNEIKG 

HLEIALLEKHFLQEELRKLREETNAEMLRQELDR

ERQRRMELEQKVQEVLKARTEEQMAQQPPKGQ 

AQASNGAERRSQGLSSRLQKWFYERFGEYVEDF 

RFQPEENTVETEEPLSARRLTENMRRLKRGAKPV 

TNFVKNLSALSDWYSVYTSAIAFTVYMNAVWH 

GWAIPLFLFLAILRLSLNYLIARGWRIQWSIVPEV 

SEPVEPPKEDLTVSEKFQLVLDVAQKAQNLFGK 

MADILEKIKNLFMWVQPEITQKLYVALWAAFLA 

SCFFPYRLVGLAVGLYAGIKFFLIDFIFKRCPRLR

A VVnTDVlIUfDCT DTHDrVT VCDCC A a \ 7 C T> D T ATTC 

AJvYU 1 r Y 11 WKi>Lr 1 L/ryLKbKboAA V SK-KJLQ 1 1 o 

SRSYVPSAPAGLGKEEDAGRFHSTKKGNFHEIFN 

LTENERPLAVCENGWRCCLINRDRKMPTDYIRN 

GVLYVTVENYLCFESSKSGSSKRNKVIKLVDITDI 

QKYKVLSVLPGSGMGIAVSTPSTQKPLVFGAMV 

HRDEAFETILSQYIKITSAAASGGDS 

3475 

A

2 

1126 

TAARRRQKGAAAAAETHGQAKAKSGWLKPYYF 

IELMESRKDITNQEELWKMKPRRNLEEDDYLHK 

DTGETSMLKRPVLLHLHQTAHADEFDCPSELQH 

TQELFPQWHLPIKIAAIIASLTFLYTLLREVIHPLA 

TSHQQYFYKIPILVINKVLPMVSITLLALVYLPGV 

TA A I\/ni VTKinTI^ VT^ \£ FTP14\a/T FiVU/X/TI T*D I/TYEY^T 
1AA1 V V^L-iilNUf 1 iv i JVTvT r o W L,Uiv W JYLL 1 KJv^r uL 

LSFFFAVLHAIYSLSYPMRRSYRYKLLNWAYQQ 

VQQNKEDALVIEHDVWRMEIYVSLGIVGLAJLAL 

LAVTSIPSVSDSLTWREFHYIQSKLGIVSLLLGTIH

ALIFAWNKWIDIKQFVWYTPPTFMIAVFLPIVVLI 

FKSILFLPCLRKKILKIRHGWEDVTKINKTEICSQL 

3476 

A 

143 

3191 

AKAPPTGESSEPEAKVLHTKRLYRAVVEAVHRL
DLILCNKTAYQEVFKPENISLRNKLRELCVKLMF 

HSRSTLECAYRTHLVAGIGFYQHLLLYIQSHYQL 

ELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQ 

MACHRCLVYLGDLSRYQNELAGVDTELLAERFY 

YQALSVAPQIGMPFNQLGTLAGSKYYNVEAMY 

CYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQL 

KKCETRJKLSPGKKRCKDIKRLLVNFlViYLQSLLQ 

PKSSSVDSELTSLCQSVLEDFNLCLFYLPSSPMLS 

LASEDEEEYESGYAFLPDLLIFQMVIICLMCVHSL 

ERAGSKQYSAAIAFTLALFSHLVNHVNIRLQAEL 

EEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPP 

PVTPQVGEGRKSRKFSRLSCLRRRRHPPKVGDDS 

DLSEGFESDSSHDSARASEGSDSGSDKSLEGGGT 

AFDAETDSEMNSQESRSDLEDMEEEEGTRSPTLE 

PPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQM 

FQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCV 

NGDVDKPSEPASEEGSESEGSESSGRSCRNERSIQ 

EKLQVLMAEGLLPAVKVFLDWLRTNPDLIIVCA 

QSSQSLWNRLSVLLNLLPAAGELQESGLALCPEV 

QDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAH 

RRFNFDTDRPLLSTLEESVVRICCIRSFGHFIARLQ 

GSBLQFNPEVGIFVSIAQSEQESLLQQAQAQFRMA 

QEEARRNRLMRDMAQLRLQLEVSQLEGSLQQPK 

AQSAMSPYLVPDTQALCHHLPVIRQLATSGRFIVI

IPRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGN 

RYIRCQKEVGKSFERHKLKRQDADAWTLYKILD

SCKQLTVLAQGAGEEDPSGMVTHTGLPLDNPSVL 

SGPMQAALQAAAHASVDIKNVLDFYKQWKEIG 

3477 

A 

1 

3902 

MTEPRERRGYSVPPRPEVGTQATEWRVEESNFN 

KIFLKKDAELGRSNHLPTWDKPEDASWLPQSCL 

GGDAVATTGEIHEEKAWKTRALEVGQPAQRDIR 

RGELWGKEHGADQAIQETLEDLSSLERTLVVSES 

SPLGGDCQEVTTLTVKYQVSEEVPSGTVIGKLSQ 

ELGREERRRQAGAAFQVLQLPQALPIQVDSEEGL 

LSTGRRLDREQLCRQWDPCLVSFDVLATGDLALI 

HVEIQVLDINDHQPRFPKGEQELEISESASLRTRIP 

LDRALDPDTGPNTLHTYTLSPSEHFALDVIVGPD 

ETKHAELIVVKELDREIHSFFDLVLTAYDNGNPP 

KSGTSLVKVNVLDSNDNSPAFAESSLALEIQEDA 

APGTLLIKLTATDPDQGPNGEVEFFLSKHMPPEW 

LDTFSIDAKTGQVILRRPLDYEKNPAYEVDVQAR 

DLGPNPIPAHCKVLIKVLDVNDNIPSrHVTWASQP 

SLVSEALPKDSFIALVMADDLDSGNNGLVHCWL 

SQELGHFRLKRTNGNTYMLLTNATLDREQWPK 

YTLTLLAQDQGLQPLSAKKQLSIQISDINDNAPVF 

EKSRYEVSTRENNLPSLHLITIKAHDADLGINGK 

VSYRIQDSPVAHLVAIDSNTGEVTAQRSLNYEEM 

AGFEFQVIAEDSGQPMLASSVSVWVSLLDANDN 

APEWQPVLSDGKASLSVLVNASTGHLLVPIETP 

NGLGPAGTDTPPLATHSSRPFLLTTIVARDADSG 

ANGEPLYSIRSGNEAHLFILNPHTGQLFVNVTNA 

SSLIGSEWELEIVVEDQGSPPLQTRALLRVMFVTS 

VDHLRDSARKPGALSMSMLTVICLAVLLGIFGLI 

LALFMSICRTEKKDNRAYNCREAESTYRQQPKR 

PQKfflQKADIHLVPVLRGQAGEPCEVGQSHKDV 

DKEAMMEAGWDPCLQAPrHLTFl LYK I LKMCJG 

NQGAPAESREVLQDTVNLLFNHPRQRNASRENL 

NLPEPQPATGQPRSRPLKVAGSPTGRLAGDQGSE 

EAPQRPPASSATLRRQRHLNGKVSPEKESGPRQI 

LRSLVRLSVAAFAERNPVEELTVDSPPVQQISQLL 

SLLHQGQFQPKPNHRGNKYLAKPGGSRSAIPDTD 

GPSARAGGOTDPFOFFGPI DPFFDI WKOI T FFF 

LSSLLDPSTGLALDRLSAPDPAWMARLSLPLTTN 

YRDNVISPDAAATEEPRTFQTFGKAEAPELSPTG 

TRLASTFVSEMSSLLEMLLEQRSSMPVEAASEAL 

RRLSVCGRTLSLDLATSAASGMKVQGDPGGKTG 

TEGKSRGSSSSSRCL 

3478 

A 

13 

1620 

TLPPPGNSGCHRLCFPEFEFLQVTKMEFSGRKWR 

KLRLAGDQRNASYPHCLQFYLQPPSENISLrEFEN 

LAIDRVKLLKSVENLGVSYVKGTEQYQSKLESEL 

RKLKJFSYRENLEDEYEPRRRDHISHFILRLAYCQS 

EELRRWFIQQEMDLLRFRFSILPKDKIQDFLKDSQ 

LQFEAISDEEKTLREQEIVASSPSLSGLKLGFESIY 

KIPFADALDLFRGRKVYLEDGFAYVPLKDIVAIIL

NEFRAKLSKALALTARSLPAVQSDERLQPLLNHL 

SHSYTGQDYSTQGNVGKISLDQIDLLSTKSFPPC 

MRQLHKALRENHHLRHGGRjMQYGLFLKGIGLT 

T FOAT OFWKOFFTKOKMnPnKFni<fnY9YTsJTRl-I<l 

FGKEGKRTDYTPFSCLKIILSNPPSQGDYHGCPFR

HSDPELLKQKLQSYKISPGGISQILDLVKGTHYQ 

V\ACQKYFEMIHTVDDCGFS\LSHPNQYFCESQRJ 

LNGGKDIKKEPIQPETPQPKPSVQKTKDASSALA 

SLNSSLEMDMEGLEDYFSEDS 

3479 

A 

•fV 


I JO 

RPFT FT WRT R^PQWPPT nVPRRPURR XIWK'PrPVR 

AQPLSVTVWAPRCQRP/QPPAPEPSSPNAAVPEAI 

PTPRAAASAALELPLGPAPVSVAPQAEAEARSTP 

GPAGSRLGPETFRQRFRQFRYQDAAGPREAFRQL 

REL/SPRQWLRPDI\RTKEQ\IVEMLVQEQLLAILP 

EAARARRIRRRTDVRITG 

3480 

A 

117 

2226 

RRGSRSRGPFAEPAAPGGLCSSSEEKTEEGGMAV 

GLCKAMSQGLVTFRDVALDFSQEEWEWLKPSQ 

KDLYRDVMLENYRNLVWLGLSrSKPNMISLLEQ 

GKEPWMVERKMSQGHCADWESWWEIEELSPK

WFIDEDEISQEMVMERLASHGLECSSFREAWKY 

KGEFELHQGNAERHFMQVTAVKEISTGKRDNEF 

SN/IWEKHTPEISIFNTTES\PTIQQVHKFDIYDKLF 

PQNSVIIEYKRLHAEKESLIGNECEEFNQSTYLSK 

DIGEPPGEKPYESHDFSKLLSFHSLFTQHQTTHFG 

KLPHGYDECGDAFSCYSFFTQPQRIHSGEKPYAC 

NDCGKAFSHDFFLSEHQRTHIGEKPYECKECNKA 

FRQSAHLAQHQRIHTGEKPFACNECGKAFSRYAF 

LVEHQRIHTGEKPYECKECNKAFRQSAHLNQHQ 

RIHTGEKPYECNQCGKAFSRRIALTLHQRJHTGE 

KPFKCSECGKTFGYRSHLNQHQRIHTGEKPYECI 

SDALVLIHHKRSHAGEKPYECNKCGKAFSCGSY

LNQHQRIHTGEKPYECSECGKAFHQILSLRLHQRI 

HAGEKPYKCNESQRVRRSELAVSRGLTTKPADT 

GPDSTLNAAKVAEPARAGTEAALRPALSVAESA 

TSLGPLHQGRRFPEAPAAHPGGTGFTVCAS 

3481 

A 

2 

1522 

ASRHGMTPGALLMLLGALGPPLAPGVRGSEAEG 
RLREKLFSGYDSSVRPAREVGDRVRVSVGLILAQ
LISLNEKDEEMSTKVYLDLEWTDYRLSWDPAEH 
DGIDSLRITAESVWLPDVVLLNNNDGNFDVALDI 
SVWSSDGSVRWQPPGIYRSSCSIQVTYFPFDWQ 
NCTMVFSSYSYDSSEVSLQTGLGPDGQGHQEIHI

HEGTFIENGQWENIHKPSRLIQPPGDPRGGREGQ 

RQEVIFYLIIRRKPLFYLVNVIAPCILITLLAIFVFY

LPPDAGEKMGLSIFALLTLTVFLLLLADKVPETSL 

S VPIIIKYLMr !M VL V 1 r o V LLo V V V LNLHriKb r H 

THQMPLWVRQIFIHKLPLYLRLKRPKPERDLMPE 

PPHCSSPGSGWGRGTDEYFIRKPPSDFLFPKPNRF 

QPELSAPDLRRFIDGPNRAVALLPELREVVSSISYI

ARQLQEQEDHDALKEDWQFVAMVVDRLFLWTF 

IIFTSVGTLWIFLDATYHLPPPDPFP 

3482 

A 

1273 

172 

ERWDSGGADAEWYALADWTAVWLPRSDFYTR 

LQTGEGHVPALRLPAGMPPDSPRELVPKQAPCSP 

SDPALPWTLGHGNQPPAWPEPQGPMGPAGVAA 

RPGRFFGVYLLYCLNPRYRVRWYVGFTVNTARR 

VQQHNGGRKKGGA\GRTSGRGPWEMVLVVHGF 

PSSVAALRFEWAWQHPHASRRLAHVGPRLRGET 

AFAFHLRVLAHMLRAPPWARLPLTLRWVRPDLR

QDLCLPPPPHVLLAFGPPPAQVPRPQRRRAGPFD 

DAEPEPDQGDPGACCSLCAQTIQDEEGPLCCPHP 

GCLLRAHVICLAEEFLQEEPGQLLPLEGQCPCCE 

KSLLWGDLIWLCQMDTEKEVEDSELEEAHWTD 

LLET 

3483 

A 

230 

3686 

WRPWPCIDTSWNLQVAARTLRVSSAQCGLVPT 

MARVESPVPAARASLTGSCVLGQAMPLRGGAGP 

SPASHGPTHGPSDPRTCLPGRGAGGMRPHGRGA 

LGCCGLCSFYTCHGAAGDEIMHQDIVPLCAADIQ 

DQLKKRFAYLSGGRGQDGSPVITFPDYPAFSEIPD

KEFQNVMTYLTSIPSLQDAGIGFILVIDRRRDKW 

TSVKASVLRIAASFPANLQLVLVLRPTGFFQRTLS 

DIAFKFNRDDFKMKVPVIMLSSVPDLHGYIDKSQ 

LTEDLGGTLDYCHSRWLCQRTAIESFALMVKQT

AQMLQSFGTELAETELPNDVQSIASSVLCAHTEK

KDKAKEDLRLALKEGHSVLESLRELQAEGSEPSV 

NQDQLDNQATVQRLLAQLNETEAAFDEFWAKH 

QQKLEQCLQLRHFEQGFREVKAILDAASQKIATF 

TDIGNSLAHVEHLLRDLANFQEKSGVFVERARA 

LSLTASSFIGNKHYAVDSIRPKCQELRHLCDQFSA 

EIARRRGLLSKSLELHRRLETSMKWCDEGIYLLA 

SQPVDKCQSQDGAEAALQEIEKFLETGAENKIQE 

LNAIYKEYESDLNQDLMEHVRKVFQKQASMEEV 

FHOIRQASLKKLAARQTRPVQPVAPRPEALAKSP 

CPSPGIRRGSENSSSEGGALRRGPYRRAKSEMSES

RQGRGSAGEEEESLAILRRHVMSELLDTERAYVE 

ELLCVLEGYAAEMDNPLMAI^LSTGLHNKKDV 

LFGNMEEIYHFHNRffLRELEWTDCPELVGRCF 

LERMEDFQIYEKYCQNKPRSESLWRQCSDCPFFQ 

ECQRKLDHKLSLDSYLLKPVQRITKYQLLLKEM 

LKYSRNCEGAEDLQEALSSILGILKAVNDSMHLI 

AITGYDGNLGDLGKLLMQGSFSVWTDHKRGHT 

KVKELARFKPMQRHLFLHEKAVLFCKKREENGE 

r.vcv a dc vewnci xtaaa a \/nTTPMWfin a if 

\j Y JbJS^Ar o i o i ISA^oIjINMAA V yjl I CIN V KOIJ/\lSJvr c, 

IWYNAREEVYIVQAPTPEIKAAWVNEIRJCVLTSQ 
LQACREASQHRALEQSQSLPLPAPTSTSPSRGNSR 
N1XKLEERKTDPLSLEGYVSSAPLTKPPEKGKGW 
SKTSHSLEAPEDDGGWSSAEEQINSSDAEEDGGL 
GPKKLVPGKYTVVADHEKGGPDALRVRSGDVV 

ELVQEGDEGLW 

3484 

A 

208 

6103 

VTMAQQAADKYLYVDKNFINNPLAQADWAAK 

KLVWVPSDKSGFEPASLKEEVGEEAIVELVENGK 

KVKWKDDIQKMNPPKFSKVEDMAELTCLNEAS 

VLHNLKERYYSGLIYTYSGLFCVVINPYKNLPIYS 

EEIVEMYKGKIOUIEMPPHIYAITDTAYRSMMQD 

REDQSILCTGESGAGKTENTKKVIQYLAYVASSH 

KSKKDQGELERQLLQANPILEAFGNAKTVKNDN 

SSRFGKFIRINFDVNGYIVGANIETYLLEKSRAIRQ 

AKEERTFHIFYYLLSGAGEHLKTDLLLEPYNKYR 

FLSNGHVTIPGQQDKDMFQETMEAMRIMGIPEEE 

QMGLLRVISGVLQLGNIVFKKERNTDQASMPDN 

TAAQKVSHLLGINVTDFTRGILTPRIKVGRDYVQ 

KAQTKEQADFAIEALAKATYERMFRWLVLRINK 

ALDKTKRQGASFIGILDIAGFEIFDLNSFEQLCINY 

TNEKLQQLFNHTMFILEQEEYQREGIEWNFEDFG 

LDLQPCIDLIEKPAGPPGILALLDEECWFPKATDK 

SFVEKVMQEQGTHPKFQKPKQLKDKADFCIIHY 

AGKVDYKADEWLMKNMDPLNDNIATLLHQSSD

KFVSELWKDVDRIIGLDQVAGMSETALPGAFKT 

RKGMFRTVGQLYKEQLAKLMATLRNTNPNFVR 

CIIPNHEKKAGKLDPHLVLDQLRCNGVLEGIRICR 

QGFPNRVVFQEFRQRYEILTPNSIPKGFMDGKQA 

CVLMIKALELDSNLYRIGQSKVFFRAGVLAHLEE

ERDLKITDVIIGFQACCRGYLARKAFAKRQQQLT 

AMKVLQRNCAAYLKLRNWQWWRLFTKVKPLL 

QVSRQEEEMMAKEEELVKVREKQLAAENRLTE 

METLQSQLMAEKLQLQEQLQAETELCAEAEELR 

ARLTAK\KQ\ELEEICHDLEARVEEEEERCQHLQA 

EKKKMQQNIQELEEQLEEEESARQKLQLEKVTT 

EAKLKKLEEEQIILEDQNCKLAKEKKLLEDRIAEF 

TTNLTEEEEKSKSLAKLKNKHEAMITDLEERLRR 

EEKQRQELEKTRRKLEGDSTDLSDQIAELQAQMA 

ELKMQLAKKEEELQAALARVEEEAAQKNMALK 

KIRELESQISELQEDLKCER\ASKNKAEKQKRDLG 

EELEALKTELEDTLDSTAAQQELRSKREQEVNEL 

KKTLEEEAKTHEAQIQEMRQKHSQAVEELAEQL 

EQTKRVKANLEKAKQTLENERGELANEVKVLLQ 

GKGDSEHKRKKVEAQLQELQVKFNEGERVRTEL 

ADKVTKLQVELDNVTGLLSQSDSKSSKLTKDFS 

ALESQLQDTQELLQEENRQKLSLSTKLKQVEDE 

KNSXFREQLEEEEEEAKHNLEKQIATLHAQVADM 

KKKMEDSVGCLETAEEVKEKLQKDLEGLSQRHE 

EKVAAYDKLEKTKTRLQQELDDLLVDLDHQRQ 

SACNLEKKQKKFDQLLAEEKTISAKYAEERDRA 

EAEAREKETKALSLARALEEAMEQKAELERLNK 

QFRTEMEDLMSSKDDVGKSVHELEKSKRAIEQQ 

VEEMKTQLEELEDELQATEDAKLRLEVNLQAM 

KAQFERDLQGRDEQSEEKKKQLVRQVREMEAE 

T CnCDVnD QAA A \f A A D VVT fcTVT "C A TJTr^C A 

LbLifcKJvC^KoMA V AAKJ<UvLbJVUJLlsX)LbAHlUo A 

NKNRDEAIKQLRKLQAQMKDCMRELDDTRASR 
EEILAQAKENEKKLKSMEAEMIQLQEELAAAER 
AKRQAQQERDELADEIANSSGKGALALEEKRRL 
EARIAQLEEELEEEQGNTELINDRLKKANLQIDQI 
NTDLNLERSHAQKNENARQQLERQNKELKVKL 
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NO: 

Method

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid,
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
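The legend above can be applied mechanically to a transcribed sequence line. The following is a minimal sketch, assuming each sequence is handled as a plain string; the function name `check_sequence` and the marker labels are hypothetical and are not part of the disclosure. It strips whitespace and reports any character that is neither one of the twenty listed residue codes nor one of the four special symbols (X, *, /, \).

```python
import re

# The twenty one-letter residue codes listed in the legend.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# Special symbols from the legend; the labels are illustrative names only.
MARKERS = {
    "X": "unknown",
    "*": "stop",
    "/": "possible deletion",
    "\\": "possible insertion",
}


def check_sequence(raw):
    """Return (cleaned sequence, invalid characters, marker counts).

    Whitespace is removed; every remaining character that is not a
    legend residue code or legend marker is reported with its
    0-based position in the cleaned sequence.
    """
    seq = re.sub(r"\s+", "", raw)
    invalid = [
        (i, ch)
        for i, ch in enumerate(seq)
        if ch not in RESIDUES and ch not in MARKERS
    ]
    markers = {name: seq.count(ch) for ch, name in MARKERS.items() if ch in seq}
    return seq, invalid, markers


if __name__ == "__main__":
    cleaned, bad, found = check_sequence("CVLM1KAL\\KQ*")
    print(cleaned, bad, found)
```

A line that yields a non-empty list of invalid characters contains at least one symbol outside the legend, for example a digit introduced during transcription, and may warrant comparison against the original listing.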





QEMEGTVKSKYKj^SITALEAKLAQLEEQLDNETK 

ERQAACKQVRRTEKKLKDVLLQVDDERRNAEQ 

YKDQADKASTRLKQLKRQLEEAEEEAQRANASR 

RXLQRELEDATETADAMNREVSSLKNKLRRGDL 

PFVVPRRMARKGAGDGSDEEVDGKADGAEAKP 

AE 

3485 

A 

2 

1782 

CSTGVSKAPLTYLMSYGFELGWRKGNRAVACR

EDRGGESVGMGQESILSQVHWWEAEPVEKTPGR 

DSEATIMSLRVHTLPTLLGAVVRPGCRELLCLLM

ITVTVGPGASGVCPTACICATDIVSCTNKNLSKVP 

GNLFRLIKRLDLSYNRIGLLDSEWIPVSFAKLNTL 

ILRHNNITSISTGSFSTTPNLKCLDLSSNKLKTVVK 

NAVFQELKVLEVLLLYNNHISYLDPSAFGGLSQL 

QKLYLSGNFLTQFPMDLYVGRFKLAELMFLDVS 

YNRIPSMPMHHINLVPGKQLRGIYLHGNPFVCD\ 

CSLVSLLVFWYRRHFSSVMDFKNDYTCRLWSDS 

RHSRQVLLLQDSFMNCSDSIINGSFRALGFIHEAQ 

VGERLMVHCDSKTGNANTDFIWVGPDNRLLEPD 

KENIENFYVFHNGSLVIESPRPEDAGVYSCIAMNK

QRLLNETVDVTINVSNFTVSRSHAHEAFNTAFTT 

LAACVASIVLVLLYLYLTPCPCKCKTKRQKNML 

HQSNAHSSILSPGPASDASADERKAGAGKRVVFL

EPLKDTAAGQNGKVRLFPSEAVIAEGILKSTRGK 

SDSDSVNSVFSDTPFVAST 

3486 

A 

357 

1173 

GDPRETKVFPSRSFARNTVGVSHHQSHLFHTVSR 

IYVEDKHKILYCEVPKAGCSNWKRILMVLNGLA

SSAYNISHNAVHYGKHLKKLDSFDLKGIYTRLDT

YTK\LVLVRDPMERLVSAFRDKFDHPNSYYHPVF 

GKAIIKKYRPNACEEALINGSGVKFKEFIHYLLDS

HRPVGMDIHWEKVSKLCYPCLINYDFVGKFETL 

EEDANYFLQMIGAPKELKFPNFKDRHSSDERTNA

QVVRQYLKJDLTRTERQLIYDFYYLDYLMFNYTT 

PFL 

3487 

A 

2 

3281 

CDKSGAVPFSTTRSPRRPSPRSAGPSLSSVSPRSQ 

LWASSGLSEEHAAPLLPAWPRHPCPPSLTPGPSM 

AQGAMRFCSEGDCAISPPRCPRRWLPEGPVPQSP 

PASMYGSTGSLLRRVAGPGPRGRELGRVTAPCTP 

LRGPPSPRVAPSPWAPSSPTGQPPPGAQSSWIFR 

FVEKASVRPLNGLPAPGGLSRSWDLGGVSPPRPT 

PALGPGSNRKLRLEASTSDPLPARGGSALPGSRN 

LVHGPPAPPQVGADGLYSSLPNGLGDPPERLATL 

FGGPADTGFLNQGDTWSSPREVSSHAQRIARAK 

WEFFYGSLDPPSSGAKPPEQAPPSPPGVGSRQGS 

GVAVGRAAKYSETDLDTVPLRCYRETDIDEVLA 

EREEADSAIESQPSSEGPPGTAYPPAPRPGPLPGP 

HPSLGSGNEDEDDDEAGGEEDVDDEVFEASEGA 

RPGSRMPLKSPVPFLPGTSPSADGPDSFSCVFEAI 

LESHRAKGTSYTSLASLEALASPGPTQSPFFTFEL 

PPQPPAPRPDPPAPAPLAPLEPDSGTSSAADGPWT 

l^KGbcbfc A b. AKA IsX Ar uKbr r or OiloiiJJoJLAjL»vjA 

APLGSEPPLSQLVSDSDSELDSTERLALGSTDTLS 

NGQKADLEAAQRLAKRLYRLDGFRKADVARHL 

GKWJDFSKLVAGEYLKFFVFTGMTLDQALRVFL 

KELALMGETQERERVLAHFSQRYFQCNPEALSSE 

DGAHTLTCALMLLNTDLHGHNIGKRMTCGDFIG
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SEQ ID 

NLEGLNDGGDFPRELLKALYSSIKNEKLQWAIDE

EELRRFLSELADPNPKVIKRISGGSGSGSSPFLDLT 

PEPGAAVYKHGALVRKVHADPDCRKTPRGKRG 

WKSFHGILKGMJDLYLQKEEYKPGKALSETELKN 

AISIHHALATRASXNYSKRPHVFYLRTADWRVFL 

FQAPSLEQMQSWITRINVVAAMFSAPPFPAAVSS 

QKKFSRPLLPSAATRLSQEEQVRTHEAKLKAMA 

SELREHRAAQLGKKGRGKEAEEQRQKEAYLEFE 

KSRYSTYAALLRVKLKAGSEELDAVEAALAQAG 

STEDGLPPSHSSPSLQPKPSSQPRAQRHSSEPRPG 

AGSGRRKP 

3488 

A 

441 

1968 

GTETPHCWGRGTAGLRRELDREERDGPGTATMS 

FPHFGHPYRGAFQFLVASASSSTTCCESTLRSVSY 

VASGSTPAPALCCAP\YDSRLLGSARPELGAALGI 

YGAPYAAAAAAQSYPGYLPYSPEPPSLYGALNP 

QYEFKEAAGSFTSSLAQPGAYYPYERTLGQYQY 

ERYGAVELSGAGRRKNATRETTSTLKAWLNEHR

KNPYPTKGEKIMLAIITKMTLTQVSTWFANARRR 

LKKENKMTWAPKNKGGEERKAEGGEEDSLGCL 

TADTKEVTASQEARGLRLSDLEDLEEEEEEEEEA 

EDEEVVATAGDRLTEFRKGAQSLPGPCAAAREG 

RLERRECGLAAPRFSFNDPSGSEEADFLSAETGSP 

RLTMHYPCLEKPRIWSLAHTATASAVEGAPPARP 

RPRSPECRMIPGQPPASARRLSVPRDSACDESSCI 

PKAFGNPKFALQGLPLNCAPCPRRSEPVVQCQYP 

SGAEGSGPPAALGVSMQKTPTYRPARQLHTLCH 

SSLP 

3489 

A 

718 

2073 

UAYHKALSYRGHVHANNRGTNNVHFTPPPSPS 

RGILPMNPRNMMNHSQVGQGIGIPSRTNSMSSSG 

LGSPNRSSPSIICMPKQQPSRQPFTVNSMSGFGMN 

RNQAFGMNNSLSSNIFNGTDGSENVTGLDLSDFP 

ALADRNRREGSGNPTPLINPLAGRAPYVGMVTK 

PANEQSQDFSIHNEDFPALPGSSYKDPTSSNDDSK 

SNLNTSGKTTSSTDGPKFPGDKSSTTQNNNQQKK 

GIQVLPDGRVTNIPQGMVTDQFGMIGLLTFIRAA 

ETDPGMVHLALGSDLTTLGLNLNSPENLYPKFAS 

PWASSPCRPQDIDFHVPSEYLTNIHIRDKLFFFFS 

W/TADCLGRYGEDLLFYLYYMNGGDVLQLLAAV 

ELFNRDWRYHKEERVWITRAPGMEPTMKTNTY 

ERGTYYFFDCLNWRKVAKEFHLEYDKLEERPHL 

PSTFNYNPAQQAF 

3490 

A 

2 

2833 

FVAKMATSQYFDFAQGGGPQYSTQAPTLPLPTV 

GASYTGQPTPGMDPAVNPAFPPAAPAGYGGYQP 

HSGQDFAYGSRPQEPVPTATTMATYQDSYSYGQ 

SAAARSYEDRPYFQSAALQSGRMTAADSGQPGT 

QEACGQPSPHGSHSHAQPPQQAPIVESGQPASTL 

SSGYTYPTATGVQPESSASIVTSYPPPSYNPTCTA 

YTAPSYPNYDASVYSAASPFYPPAQPPPPPGPPQ 

QLPPPPAPAGSGSSPRADSKPPLPSKLPRPKAGPR 

QLQLHYCDICKISCAGPQTYREHLGGQKHRKKE 

AAQKTGVQPNGSPRGVQAQLHCDLCAVSCTGA 

DAYAAHIRGSKHQKVFKLHAKLGKPIPTLEPALA 

TESPPGAEAKPTSPTGPSVCASSRPALAKRPVASK 

ALCEGPPEPQAAGCRPQWGKPAQPKLEGPGAPT 

QGGSKEAPAGCSDAQPVGPEYVEEVFSDEGRVL 
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SEQID 

RPHCKLCECSFNDLNAKDLHVRGRRHRLQYRKK 

VNPDLPL\TEPSSRARKVLEERMRKQRHLAEERL 

EQLRRWHAERRRLEEEPPQDVPPHAPPDWAQPL 

LMGRPESPASAPLQPGRRPASSDDRHVMCKHATI

YPTEQELLAVQRAVSHAERALKLVSDTLAEEDR 

GRREEEGDKRSSVAPQTRVLKGVMRVGILAKGL

LLRGDRNVRLALLCSEBCPTHSLLRRIAQQLPRQL

QMVTEDEYEVSSDPEANIVISSCEEPRMQVTISVT 

SPLMREDPSTDPGVEEPQADAGDVLSPKKCLESL 

AALRHARWFQARAoGLQPCVlVlRVLRDLCRRV 

PTVWGALPAWAMELLVEKAVSSAAGPLGPGDAV 

RRVLECVATGTLLTDGPGLQDPCERDQTDALEP 

MTLQEREDVTASAQHALRMLAFRQTHKVLGMD 

LLPPRHRLGARFRKRQRGPGEGEEGAGEKKRGR 

RGGEGLV 

3491 

A 

2 

1321 

FVGDGALSGCRRGRAPRVPSMAGSLPPCVVDCG 

TGYTKLGYAGNTEPQFIIPSCIAIRESAKVVDQAQ 

RRVLRGVDDLDFFIGDEAIDKPTYATKWPIRHGII 

EDWDLMERFMEQVVFKYLRAEPEDHYFLMTEP 

PLNTPENREYLAEIMFESFNVPGLYIAVQAVLAL 

AASWTSRQVGERTLTGIVIDSGDGVTHVIPVAEG

YVIGSCIKHPIAGRDITYFIQQLLREREVGIPPEQS 

LETAKAIKEKYCYICPDIVKEFAKYDVDPRKWIK 

QYTGINAINQKKFVIDVGYERFLGPEIFFHPEFAN

PDFMESISDWDEVIQNCPrDVRRPLYKNVVLSG 

GSTMFRDFGRRLQRDLKRVVDARLRLSEELSGGV 

RIKPKPVEVQVVTHHMQRYAV\WFGG\SMLASTP 

EFFQVCHTKKDYEEYGPSICRHNPVFGVMS 

3492 

A 

3 

2024 

PNGVALLHLPGAAVIPNThm4FQDALGGRSRGS 

MESPAPSRAPASASLWRRLVVVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKJGRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYHKTKJ^SEQARSHLEKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKiKMLDCSPILSSFQVILLE 

ffllMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NILM.QLHTLLGLYCVSWCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEVV\LYS 

LLbRlNrDHorr VooHLLKAAAr Y VKGLrarrl^uK 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 

FYVLGNl^SNNMVWAMQLASKIPDMSVQLW 

SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 

LQDHTEACSLPEHNLITWTDGPPPVQFQAQNGPN 

TSLASLL 

3493 

A 

A 

J 

2024

r>XT/^ , \/ ATT T_TT DP A A \/I"DKTTTvTVKAP/*^'r\ A T ClCW} QP/"1C 

REESPAPSRAPASASLWRRLVVVEAKMAAHAAA 

AAQAAAAQAAHAEAADSWYLALLGFAEHFRTS 

SPPKIRLCVHCLQAVFPFKPPQRIEARTHLQLGSV 

LYrfflTKNSEQARSl^EKAWLISQQIPQFEDVKFE 

AASLLSELYCQENSVDAAKPLLRKAIQISQQTPY 
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SEQ ID 

WHCRLLFQLAQLHTLEKDLVSACDLLGVGAEY 

ARVVGSEYTRALFLLSKGMLLLMERKLQEVHPL 

LTLCGQIVENWQGNPIQKESLRVFFLVLQVTHYL 

DAGQVKSVKPCLKQLQQCIQTISTLHDDEILPSNP 

ADLFHWLPKEHMCVLVYLVTVMHSMQAGYLE 

KAQKYTDKALMQLEKLKMLDCSPILSSFQVILLE 

HIIMCRLVTGHKATALQEISQVCQLCQQSPRLFS 

NHAAQLHTLLGLYCVSVNCMDNAEAQFTTALR 

LTNHQELWAFIVTNLASVYIREGNRHQEVVXLYS 

I I FRrMPFjH^FPV^TJPT RA A APWRPJ FCpcrvr^R 

YNEAKRFLRETLKMSNAEDLNRLTACSLVLLGHI 
FYVLGNHRESNNMVVPAMQLASKIPDMSVQLW 
SSALLRDLNKACGNAMDAHEAAQMHQNFSQQL 
LQDHIEACSLPEHNLITWTDGPPPVQFQAQNGPN 
TSLASLL 

3494 

A 

2 

1615 

VLRGQRGPAGGLAEERRRGRNEWRIHDVTTAPF 

PGLVQRRSRLLIVSQVRYFLKNKVSPDLCNEDGL 

TALHQCCIDNFEEIVKLLLSHGANVNAKDNELW 

TPLHAAATCGHINLVKILVQYGADLLAVNSDGN

MPYDLCEDEPTLDVIETCMAYQGITQEKINEN4RV 

APEQQMIADIHCMIAAGQDLDWIDAQGATLLHI 

AGANGYLRAAELLLDHGVRVDVKDWDGWEPL 

HAAAFWGQMQMAELLVSHGANXLNARTSMDE 

MPIDLCEEEEFKVLLLELK\HKHDVIMKSQLRHK 

SSLSRRTSHRQAS/SVGKVVRRTQPVGTGPNL\YR 

KEYE/GEEAILWQRSA\AEDQRTSTYNGDIRET\R 

TnOFMKTiPMPRT FK'VPVT T ^PFPTl^rPRf^PF HN/tPV 
i uk^s^iS ivi^riNjr Jc\JL.CJv\r V L^L/Ocrr 1 JSJ.rivOiil-.lJiVll^ V 

ENGLRAPVSAYQYALANGDVWKVHEVPDYSM 

AYGNPGVADATPPWSSYKEQSPQTLLELKRQRA 

AAKLLSHPFLSTHLGSSMARTGESSSEGKAPLIG 

GRTSPYSSNGTSVYYTVTSGDPPLLKFKAPIEEM 

EEKVHGCCRIS 

3495 

A 

327 

1078 

APMADTTPNGPQGAGAVQFMMTNKLDTAMWL 
SRLFTVYCSALFVLPLLGLHEAASFYQRALLANA 
I T9AT RT RORT PHFOT <5R AFI AOAT I FH^rRVT T 

YSLIFVNSYPVTMSIFPVLLFSLLHAATYTKKVIA

DARG\SNSLPLLR\SVLDKLSANQQNILKFIACNEI 

FLMPATVFMLFSGQGSLLQPFIYYRFLTLRYSSRR 

NPYCRTLFNELRIVVEHIIMKPACPLFVRRLCLQS 

IAFISRLAPTVP

3496 

A 

3 

2867 

SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAP 

APGTPAASGWQPPTYHSGRAFSARYPRPSRRGYS 

SrfflGPSWRKKYSLVNRPPGPSDPPADHAVRPLH 

GARGGQPPVPQQHVLERQVQLSQGQNVVIKVKP 

PSKSGSASASGAQRGSLEEFEDTPWSDQRPREGE 

GEPPRGQLQPSRPTRARGTCSVEDPLLVCQKEPG 

KPRMVKSVGSVGDSPREPRRTVSESVIAVKASFP 

SSALPPRTGVALGRKLGSHSVASCAPQLLGDRRV 

DAGHTDQPVPSGSVGGPARPASGPRQAREASLV 

VTCRTNKFRKNNYKWVAASSKSPRVARRALSPR 

VAAENVCKASAGMANKVEKPQLIADPEPKPRKP 

ATSSKPGSAPSKYKWKASSPSASSSSSFRWQSEA 

GSKDHASQLSPVLSRSPSGDXRPALAHSGLKPLSG 

ETPLSAYKVKTRTKIIRRRGSTSLPGDKKSGTSPA 

ATAKSHLSLRRRQALRGKSSPVLKKTPNKGLVQ 
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SEQID 

VTKHRLCRLPPSRAm.PTKEASSLHAVRTAPTSK 

VIKTRYRIVKKTPASPLSAPPFPLSLPSWRARRLS

LSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYR

CIGGVLYKVSANKLSKTSGQPSDAGSRPLLRTGR 

LDPAGSCSRSLASRAVQRSLAIIRQARQRREKRK 

EYCMYYNRFGRCNRGERCPYIHDPEKVAVCTRF 

VRGTCKKTDGTCPFSHHVSKEKMPVCSYFLKGI 

CSNSNCPYSHVYVSRKAEVCSDFLKGYCPLGAK 

CKKKHTLLCPDFARRGACPRGAQCQLLHRTQKR 

HSRRAATSPAPGPSDATARSRVSASHGPRKPSAS 

QRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS 

KASSSSSSSSSPPASLDHEVAPSLQEAALAAACSN 

RLCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDSG 

KPLHIKPRL 

3497 

A 

1586 

141 

ATARDLGCARRIDRVVMESTPSRGLNRVHLQCR

NLQEFLGGLSPGVLDRLYGHPATCLAVFRELPSL

AKNWVMRMLFLEQPLPQAAVALWVKKEFSKA 

QEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQN 

LRJ ALLGGGKA WSDDTSQLGPDKHARDVPSLDK 

YAEERWEVVLHFMVGSPSAAVSQDLAQLLSQA 

GLMKSTEPGEPPCITSAGFQFLLLDTPAQLWYFM

LQYLQTAQSRGMDLVEILSFLFQLSFSTLGKDYS 

VEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYP 

T/RALAINLSSGVSGAGGTVHQPGFIVWETNYRL 

YAYTESELQIALIALFSEMLYPFP\NMVV\ARVTR\

ESVQQAIASGITAQQIIHFLRTRAHPVMLKQTPVL 

PPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDF

ELL\LAHAPKLGVLVFE/NTPAKRLMVVTPAGHS 

DVKRFWKRQKHSS 

3498 

A 

790 

190 

RDLGPAALMTASASSFSSSQGVQQPSIYSFSQITR 

SLFLSNGVAANDKLLLSSNRITAIVNASVGSGQRI 

LRG\LQYIKVPVTDARDSRLYDFFDPIADLIHTVS 

MRQGRTLLNCMAGVMSRSASLCLAYLMKYHSM 

S\LLDAHTWA/TKSRRPIIRPNNGFWEQLINYEFK 

LF^n^WTVRMINSPVGNIPDIYEKDLRMMISM 

3499 

A

31 

1586 

TAGFLLAPLEMQRLLTPVKRILQLTRAVQETSLT 

PARLLPVAHQRFSTASAVPLAKTDTWPKDVGIL

ALEVYFPAQYVDQTDLEKYNNVEAGKYTVGLG 

QTRMGFCSVQEDINSLCLTVVQRLMERIQLPWD

SVGRLEVGTETIIDKSKAVKTVLMELFQDSGNTD

IEGIDTTNACYGGTASLFNAANWMESSSWDGRY 

AMVVCGDIAVYPSGNARPTGGAGAVAMLIGPK 

APLALERGLRGTHMENVYDFYKPNLASEYPIVD 

GKLSIQCYLRALDRCYTSYRKKIQNQWKQAGSD 

RPFTLDDLQYMIFHTPFCKMVQKSLARLMFNDF 

LSASSDTQTSLYKGLEAFGGLKLEDTYTNKDLD 

KALLKASQDMFDKKTKASLYLSTHNGNMYTSSL 

YGCLASLLSHHSAQELAGSRIGAFSYGSGLAASF 

FSFRVSQDAAPGSPL\DKLVSSTSDLPKRLASRKC 

VSPEEFTEIMNQREQFYHKVNFSPPGDTNbLFPGT 

WYLERVDEQHRRKYARRPV 

3500 

A 

185 

2692 

MLPTEVPQSHPGPSALLLLQLLLPPTSAFFPNIWS 
LLAAPGSITHQDLTEEAALNVTLQLFLEQPPPGRP 
PLRLEDFLGRTLLADDLFAAYFGPGSSRRFRAAL
GEVSRANAAQDFLPTSRNDPDLHFDAERLGQGR 
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SEQ ID 

ARLVGALRETVVAARALDHTLARQRLGAALHA 

LQDFYSHSNWVELGEQQPHPHLLWPRQELQNLA 

QVADPTCSDCEELSCPRNWLGFTLLTSGYFGTHP 

PKPPGKCSHGGHFDRSSSQPPRGGINKDSTSPGFS 

PHHMLHLQAAKLALLASIQAFSLLRSRLGDRDFS 

RLLDITPASSLSFVLDTTGSMGEEINAAKIQARHL

VEQRRGSPMEPVHYVLVPFHDPGFGPVFTTSDPD 

SFWQQLNEIHALGGGDEPEMCLSALQLALLHTPP 

LSDIFVFTDASPKDAFLTNQVESLTQERRCRVTFL 

VTEDTSRVQGRARREILSPLRFEPYKAVALASGG

EVIFTKDQHIRDVAAIVGESMAALVTLPLDPPVV 

VPGQPLVFSVDGLLQKITVRIHGDISSFWIKNPAG

VSQGQEEGGGPLGHTRRFGQFWMVTMDDPPQT 

GTWEIQVTAEDTPGVRVQAQTSLDFLFHFGIPME 

DGPHPGLYPLTQPVAGLQTQLLVEVTGLGSRAN 

pnnpnPHFWVii RriVPFPtApi r.nvpi ppvcppp 

rvJL'ryrn.rijn v ilivvj v r fjVj/\c.L»vJv^ v rLcrVUrrij 

RGLLAASLSPTLLSTPRPFSLELIGQDAAGRRLHR 
AAPQPSTVVPVLLELSGPSGFLAPGSKVPLSLRIA 
SFSGPQDLDLRTFVNPSFSLTSNLSRAHLELNESA 
WGRLWLEVPDSAAPDSVVMVTVTAGGREANPV 
PPTHAFLRLLVSAPAPQDRH 

3501 

A 

1245 

5815 

RRAHPSHSRLSPYLSVSRDPYFFVTVSRTILTLSA 

PAPPRRTPAPSMGTALLQRGGCFLLCLSLLLLGC 

WAELGSGLEFPGAEGQWTRFPKWNACCESEMSF 

QLKTRSARGLVLYFDDEGFCDFLELILTRGGRLQ

LSFSIFCAEPATLLADTPVNDGAWHSVRIRRQFR 

NTTLFIDQVEAKWVEVKSKRRDMTVFSGLFVGG

LPPELRAAALKLTLASVREREPFKGWIRDVRVNS 

SQVLPVDSGEVKLDDEPPNSGGG\SPCEAGEEGE 

GGVCLNGGVCSVVDDQAVCDCSRTGFRGKDCS 

QEDNNVEGLAHLMMGDQGKEEYIATFKGSEYF 

CYDLSQNPIQSSSDEITLSFKTLQRNGLMLHTGKS 

ADYVNLALKNGAVSLVINLGSGAFEALVEPVNG

KFNDNAWHDVKVTRNLRQHSGIGHAMVTISVD 

GILTTTGYTQEDYTMLGSDDFFYVGGSPSTADLP 

GSPVSNNFMGCLKEVVYKNNDVRLELSRLAKQ 

GDPKMKIHGVVAFKCENVATLDPITFETPESFISL 

PKWNAKKTGSISFDFRTTEPNGLILFSHGKPRHQ 

KDAKHPQMIKVDFFAIEMLDGHLYLLLDMGSGT 

IKIKALLKKVNDGEWYHVDFQRDGRSGTISVNT 

LRTPYTAPGESEILDLDDELYLGGLPENKAGLVF 

PTEVWTALLNYGYVGCIRDLFIDGQSKDIRQMA 

EVQSTAGVKPSCSKETAKPCLSNPCKNNGMCRD 

GWNRYVCDCSGTGYLGRSCEREATVLSYDGSM 

FMKIQLPVVMHTEAEDVSLRFRSQRAYGILMAT 

TSRDSADTLRLELDAGRVKLTVNLDCIRINCNSS 

KGPETLFAGYNLNDl^WHTVRVVRRGKSLKLT 

VDDQQAMTGQMAGDHTRLEFHNIETGUTERRY 

LSSVPSNFIGHLQSLTFNGMAYIDLCKNGDIDYC 

ELNARFGFRNIIADPVTFKTKSSYVALATLQAYT 

SMHLFFQFKTTSLDGLILYNSGDGNDFIVVELVK 

GYLHYVFDLGNGANLIKGSSNKPLNDNQWHNV 

MISRDTSNLHTVKIDTKJTTQITAGARNLDLKSDL 

YIGGVAKETYKSLPKLVHAKEGFQGCLASVDLN 

G\RLP\DLISDGSFSCNGTDSRRGMWKGPSTT\CQ 
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SEQ ID 

EDSCSNQGVCLQQWDGFSCDCSMTSFSGPLCND 

PGTTYIFSKGGGQITYKWPPNDRPSTRADRLAIGF 

STVQKEAVLVRVDSSSGLGDYLELHIHQGKIGVK 

FNVGTDDIAIEESNAIINDGKYHVVRFTRSGGNA 

TLQVDSWPVBERYPAGRQLTIFNSQATIIIGGKEQ 

GQPFQGQLSGLYYKGLKVLNMAAENDANIAIYG 

bA/KLVGEVPSSMTTESTATAMQS 

TLATSTARRGKPPTKEPISQTTDDILVASAECPSD 

DEDIDPCEPSSGGLANPTRAGGREPYPGSAEVIRE 

SSSTTGMVVGIVAAAALCILILLYAMYKYRNRDE 

OA/T TA /r\r?CD\r\/TOXTC A ACXT/^ A "V T\ J IS T2 \S AT^CP A l/CO 

GSYHVDEbRNYIbNSAQbNUAVVi^ 
NKNKKNKDKEYYV 

3502 

A 

394 

72 

KPAHLPFTVIIMPKRKPSEGAMSDKVKA/KFELQ 
RRSAGLFSKPTPPKPETRPKKDPANQRQKLPKVR 
KGKADA/SKEGNSPAEERCSMVQTQKVEGWRSG 
SELPVALSF 

3503 

A 

43 

3358 

SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLS

SLPPPPSRALAPTRAPDTALTIMEVAEVESPLNPS 

CKJMTFRPSMEEFREFNKYLAYMESKGAHRAGL 

AKVIPPKEWKPRQCYDDIDNLLIPAPIQQMVTGQ 

SGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRY 

LDYEDLERKYWKNLTFVAPIYGADINGSIYDEGV

DEWNIARLNTVLDVVEEECGISIEGVNTPYLYFG

MWKTTFAWHTEDMDLYSINYLHFGEPKSWYAIP 

PEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLIS 

PSVLKKYGIPFDKITQEAGEFMITFPYGYHAGFN 

HGmCAESTOTATVRWIDYGKVAKLCTCRKDM 

VKISMDIFVRKFQPDRYQLWKQGKDIYTIDHTKP 

TPASTPEVKAWLQRRRKVRKASRSFQCARSTSK 

RPKADEEEEVSDEVDGAEVPNPDSVTDDLKVSE 

KSEAAVKLRNTEASSEEESSASRMQVEQNLSDHI 

KLSGNSCLSTSVTEDIKTEDDKAYAYRSVPSISSE 

ADDSIPLSTGYEKPEKSDPSELSWPKSPESCSSVA 

ESNGVLTEGEESDVESHGNGLEPGEIPAVPSGER 

NSFKVPSIAEGENKTSKSWRHPLSRPPARSPMTL 

VKQQAPSDEELPEVLSIEEEVEETESWAKPLIHL 

WQTKPPNFAAEQEYNATVARMKPHCAICTLLMP 

YHKPDSSNEENDARWETKLDEWTSEGKTKPLIP 

EMCFIYSEENIEYSPPNAFLEEDGTSLLISCAKCC 

VRVHASCYGIPSHEICDGWLCARCKRNAWTAEC 

CLCNLRGGALKQTKNNKWAHVMCAVAVPEVR 

FTNVPERTQIDVGRIPLQRLKLKCIFCRHRVKRVS 

GACIQCSYGRCPASFHVTCAHAAGVLXMEPDDW 

PYVVMTCFRHKVNPNVKSKACEKVISVGQTVIT

KHRNTRYYSCRVMAVTSQTFYEVMFDDGSFSRB 

TFPEDIVSRDCLKLGPPAEGEVVQVKWPDGKLY 

GAKYFGSNIAHMYQVEFEDGSQIAMKREDIYTL

DEELPKRVKARFVSAGRCHLGTCQVNSLSSPHVS 

QAQQETYLGFWINSKKSQCNIFLSGTY 

3504

A 

A 

1124

139 

D PCCAm A CCD D 1? A PT PUPCDT ACCCD T T V> A 1/UD 

KOhbljrDA.br KKr ACLOr CjbKJLQhr bKLLRA V HK 

SRAWTCYLAIRMLMATCCPSPTTTACTGPWQRA 

PPLRLLVQKREADSSGLAFASNSLQRRKKGLLLR 

PVAPLRTRPPLLISLPQDFRQVSSVIDVDLLPETH 

RRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LER 

VPGIFISRLVRGGLAESTGLLAVSDEILEVNGIEV
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SEQ ID 

AGKTLNQVTDMMVANSHNXLIVTVKPANQRNN 
VVRGASGRLTGPPSAGPGPAEPDSDDDSSDLVIE 
NRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPS 
LDDQEQASSGWGSRIRGDGSGFSL

3505 

A 

3 

2898 

SCRSATSQSGCGGGRSWLCSSLKMAAQPPRGIRL 

SALCPKFLHTNSTSHTWPFSAVAELIDNAYDPDV 

NAKQIWIDKTVINDHICLTFTDNGNGMTSDKLH 

KMLSFGFSDKVTMNGHVPVGLYGNGFKSGSM\R 

LGKDAIVFTKNGESMSVGLLSQTYLVEVIKAEHV

VVPIVAFNKHRQMINLAESKASLAAILEHSLFSTE

QKLLAELDAIIGKKGTRJIIWNLRSYKNATEFDFE 

KDKYDIRIPEDLDEITGKKGYKKQERMDQIAPES 

DYSLRAYCSILYLKPRMQnLRGQKVKTQLVSKS 

LAYIERDVYRPKFLSKTVRITFGFNCRNKDHYGI 

MMYHRNRLIKAYEKVGCQLRANNMGVGVVGII 

ECNFLKPTHNKQDFDYTNEYRLTITALGEKLND 

YWNEMKVKKNTEYPLNLPVEDIQKRPDQTWVQ 

CDACLKWRKLPDGMDQLPEKWYCSNNPXDPQFR 

NCEVPEEPEDEDLVHPTYEKTYKKTNKEKFRJRQ 

PEMIPRTNAELLFRPT\ALSTPS\FSSPKESVSKR/RH 

LSEGTNSYATRLLNNHQVPPQSEPESNSLKRRLS 1 

TRSSILNAKNRRL\SSQF\ENSVYKG\DDDDEDVII

LEENSTPKPAVDHDIDMKSEQSHVEQGGVQVEF 

VGDSEPCGQTGSTSTSSSRCDQGNTAATQTEVPS 

LVVKKEETVEDEIDVRNDAVILPSCVEAEAKJHE 

TQETTDKSADDAGCQLQELRNQLLLVTEEKENY 

KRQCHMFTDQIKVLQQRILEMNDKYVKKETCH 

OSTETDAVFLLESINGKSESPDHMVSOYOOAI FF 

IERLKKQCSALQHVICAECSQCSNNESKSEMDEM 

AVQLDDVFRQLDKCSIERDQYKSEVELLEMEKS 

QIRSQCEELKTEVEQLKSTNQQTATDVSTSSNIEE

SVNHMDGESLKLRSLRVNVGQLLAMIVPDLDLQ 

QVNYDVDWDEILGQWEQMSEISST 

3506 

A 

2 

2120 

RPPEAGGRYRAGGRRQAAKPSRPPLPSRRRLPQG 

GRTRRAMDRPAAAAAAGCEGGGGPNPGPAGGR 

RPPRAAGGATAGSRQPSVETLDSPTGSHVEWCK 

QLIAATISSQISGSVTSENVSRDYKALRDGNKLA 

QMEEAPLFPGESIKAIVKDVMYICPFMGAVSGTL 

TVTDFKLYFKNVERDPHFILDVPLGVISRVEKIGA 

QSHGDNSCGIEIVCKDMRNLRLAYK\QEEQSKLG 

IFENLNKHAFPLSNGQALFAFSYKEKFPINGWKV 

YDPVSEYKRQGLPNESWKISKINSNYEFCDTYPA 

IIVVPTSVKDDDLSKVAVFLAKGRVPVLSWIHPE 

SQATITRCSQPLVGPNDKRCKEDEKYLQTIMDAN 

AQSHKLIDFDARQNSVADTNKTKGGGYESESAYP 

NAELVFLEMNIHVMRESLRKLKEIVYPSIDEARW 

LSNVDGTHWLEYIRMLLAGAVRIADKEESGKTSV

VVHCSDGWDRTAQLTSLAMLMLDSYYRTIKGFE 

TLVEKEWISFGHRFALRVGHGNDNHADADRSPTF 

LQFVDCVWQMTRQFPSAFEFNELFLITILDHLYS

CLFGTFLCNCEQQRFKEDVYTKTISLWSYINSQL 

DEFSNPFFVNYENHVLYPVASLSHLELWVNYYV 

RWNPRMRPQMPIHQNLKELLAVRAELQKRVEG 

LQREVATRAVSSSSERGSSPSHFATSVHTLV 

3507 

A 

1 

2169 

GSSIKIRLTVLCAKNLAKKDFFRLPDPFVAKIVVD
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SEQID 

GSGQCHSTDTVKNTLDPKWNQHYDLYVGKTDSI 

TISVWNHKKIHKKQGAGFLGCVRLLSNAISRLKD 

TGYQRLDLCKLNPSDTDAVRGQIVVSLQTRDRIG 

TGGSVVDCRGLLENEGTVYEDSGPGRPLSCFME 

EPAPYTDSTGAAAGGGNCRFVESPSQDQRLQAQ 

RLRNPDVRGSLQTPQNRPHGHQSPELPEGYEQRT 

TVQGQVYFLHTQTGVSTWHDPRIPRDLNSVNCD 

ELGPLPPGWEVRSTVSGRIYFVDHNNRTTQFTDP 

RLHHIMNHQCQLKEPSQPLPLPSEGSLEDEELPA 

QRYERDLVQKLKVLRHELSLQQPQAGHCRJEVS 

REEIFEESYRQIMKMRPKDLKKRLMVKFRGEEG 

LDYGGVAREWLYLLCHEMLNPYYGLFQYSTDNI 

YMLQINPDSSINPDHLSYFHFVGRIMGLAVFHGH 

YINGGFTVPFYKQLLGKPIQLSDLESVDPELHKSL 

VWILENDITPVLDHTFCVEHNAFGRILQHELKPN 

(j\KM V 1 bbNKJvh Y VKJL Y VN WKrMKuliJA^rL 

ALQKGFNELIPQHLLKPFDQKELELIIGGLDKIDL 

NDWKSNTRLKHCVADSNIVRWFWQAVETFDEE 

RRARLLQFVTGSTRVPLQGFKALQGSTGVAAGPR 

LFTIHLIDANTDNLRKAHTCFNRIDIPPYESYEKL

YEKLLTAVEETCGFAVE 

3508 

A 

3 

6388 

ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAKLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGRlsT^GPPGNKKXIYFIDDMNMPEV 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTINPRLQRHFSVFVLSFPGAD 

ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV 

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG 

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENnSNVRNEVKSQ 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPVGNKL 

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLrN 

FMVPMrHFKTPT V" A TT? P VI HFiPPFKIPPT^V ATK'^YA 
r IN JvXilN irLC/lN OjL>iVrVix\Jr I L^JUrJir IN X JC/P V /\ 1 rvO I r\ 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 
DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 
TADKLKCQQEAEVTAVTISLANRLVGGLASENV 
RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 
KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 
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SEQ ID 

LMDDADVAAWQNEGLPADRMSVENATILINCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG

YLQUEQALEAGAVVLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAVVSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKrNEAREHYRPAAARASLLYFINfNDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYVVGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEVVAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRKFGPQGWNRSY 

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL 

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEKLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE 

ERTPY1VVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GIITEAKLKDLTPPMPVMFIKAIPAD\RQDCGHVY 

SCPVTKTSQVRDPTYVWTFNLKTKENPSKWVLA 

GVALLLQI 

3509 

A 

3 

6388 

ILYINPADLGWNPPVSSWIEKREIQTERANLTILF 

DKYLPTCLDTLRTRFKKIIPIPEQSMVQMVCHLLE

CLLTTEDIPADCPKEIYEHYFVFAAIWAFGGAMV 

QDQLVDYRAEFSKWWLTEFKTVKFPSQGTIFDY 

YIDPETKKFEPWSKLVPQFEFDPEMPLQACLVHT 

SETIRVCYFMERLMARQRPVMLVGTAGTGKSVL 

VGAXLASLDPEAYLVKNVPFNYYTTSAMLQAVL 

EKPLEKKAGRNYGPPGNKJO.IYFIDDMNMPEVD 

AYGTVQPHTIIRQHLDYGHWYDRSKLSLKEITNV 

QYVSCMNPTAGSFTFNPRLQRHFSVFVLSFPGAD 

ALSSIYSIILTQHLKLGNFPASLQKSIPPLIDLALAF 

HQKIATTFLPTGIKFHYIFNLRDFANIFQGILFSSV

ECVKSTWDLIRLYLHESNRVYRDKMVEEKDFDL 

FDKIQTEVLKKTFDDIEDPVEQTQSPNLYCHFAN 

GIGEPKYMPVQSWELLTQTLVEALENHNEVNTV 

MDLVLFEDAMRHVCHINRILESPRGNALLVGVG

GSGKQSLTRLAAFISSMDVFQITLRKGYQIQDFK 

MDLASLCLKAGVKNLNTVFLMTDAQVADERFL 

VLINDLLASGEIPDLYSDDEVENIISNVRNEVKSQ 

GLVDNRENCWKFFIDRIRRQLKVTLCFSPVGNKL

RVRSRKFPAIVNCTAIHWFHEWPQQALESVSLRF 

LQNTEGIEPTVKQSISKFMAFVHTSVNQTSQSYLS 

NEQRYNYTTPKSFLEFIRLYQSLLHRHRKELKCK 

TERLENGLLKLHSTSAQVDDLKAKLAAQEVELK 

QKNEDADKLIQVVGVETDKVSREKAMADEEEQ 
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JL -V»!ULa»7!IC /\ CI U r * * riCji} !.1..J V»— vsIjClUe, « » — JiiSIJUi lit , 

KVAVIMLEVKQKQKDCEEDLAKAEPALTAAQA 

ALNTLNKTNLTELKSFGSPPLAVSNVSAAVMVL 

MAPRGRVPKDRSWKAAKVTMAKVDGFLDSLIN 

FNKENIHENCLKAIRPYLQDPEFNPEFVATKSYA 

AAGLCSWVINIVRFYEVFCDVEPKRQALNKATA 

DLTAAQEKLAAIKAKIAHLNENLAKLTARFEKA 

TADKLKCQQEAEVTAVTISLANRLVGGLASENV 

RWADAVQNFKQQERTLCGDILLITAFISYLGFFT 

KKYRQSLLDRTWRPYLSQLKTPIPVTPALDPLRM 

LMDDADVAAWQNEGLPADRMSVENATILENCE 

RWPLMVDPQLQGIKWIKNKYGEDLRVTQIGQKG

YLQIIEQALEAGAVVLIENLEESIDPVLGPLLGRE 

VIKKGRFIKIGDKECEYNPKFRLILHTKLANPHYQ 

PELQAQATLINFTVTRDGLEDQLLAAVYSMERP 

DLEQLKSDLTKQQNGFKITLKTLEDSLLSRLSSAS 

GNFLGETVLVENLEITKQTAAEVEKKVQEAKVT 

EVKINEAREHYRPAAARASLLYFIMHDLSKIHPM 

YQFSLKAFSIVFQKAVERAAPDESLRERVANLID 

SITFSVYQYTIRGLFECDKLTYLAQLTFQILLMNR 

EVNAVELDFLLRSPVQTGTASPVEFLSHQAWGA 

VKVLSSMEEFSNLDRDIEGSAKSWKKFVESECPE 

KEKLPQEWKNKTALQRLCMLRAMRPDRMTYAL 

RDFVEEKLGSKYVVGRALDFATSFEESGPATPMF 

FILSPGVDPLKDVESQGRKLGYTFNNQNFHNVSL 

GQGQEVVAEAALDLAAKKGHWVILQNTLEMCS 

RETEFKSILFALCYFHAVVAERRKFGPQGWNRSY

PFNTGDLTISVNVLYNFLEANAKVPYDDLRYLFG 

EIMYGGHITDDWDRRLCRTYLGEFIRPEMLEGEL

SLAPGFPLPGNMDYNGYHQYIDAELPPESPYLYG 

LHPNAEIGFLTQTSEFCLFRTVLELQPRDSQARDG 

AGATREEKVKALLEEILERVTDEFNIPELMAKVE

ERTPYTVVAFQECGRMNILTREIQRSLRELELGLK 

GELTMTSHMENLQNALYFDMVPESWARRAYPS 

TAGLAAWFPDLLNRIKELEAWTGDFTMPSTVWL 

TGFFNPQSFLTAIMQSTARKNEWPLDQMALQCD 

MTKKNREEFRSPPREGAYIHGLFMEGACWDTQA 

GUTEAKLKDLTPPMPVMFIKAIPADXRQDCGHVY 

SCPVTKTSQ\RDPTYVWTFNLKTBCENPSKWVLA 

GVALLLQI 

3510 

A 

390 

3330 

AAGSGSRPPAPAARKMADLAECNIKVMCRFRPL

NESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVF 

QSSTSQEQVYNDCAKKIVKDVLEGYNGTIFAYG 

QTSSGKTHTMEGKLHDPEGMGIIPRIVQDIFNYIY

SMDENLEFHIKVSYFEIYLDKIRDLLDVSKTNLSV 

HEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKS 

NRHVAVTNMNEHSSRSHSIFLINVKQENTQTEQK 

LSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNIN 

KSLSALGNVISALAEGSTYVPYRDSKMTRILQDS 

LGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI 

K^WCVNVELTAEQWKKKYEKEKEKNKILRNTI 

QWLENELNRWRNGETVPIDEQFDKEKANLEAFT 

VDKDITLTNDKPATAIGVIGNFTDAERRKCEEEIA 

KLYKQLDDKDEEINQQSQLVEKLKTQMLDQEEL 

LASTRRDQDNMQAELNRLQAENDASKEEVKEV 

LQALEELAVNYDQKSQEVEDKTKEYELLSDELN 
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Amino acid sequence (A=Alanine C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine,
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine,
X=Unknown, *=Stop codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)





QKSATLASIDAELQKXKEMTNHQKKRAAEMMA

SLLKDLAEIGIAVGNNDVKQPEGTGMIDEEFTVA 

RLYISKMKSEVKTKWKRCKQLESTQTESNKKME 

ENEKELAACQLRISQHEAKIKSLTEYLQWEQKK

RQLEESVDALSEELVQLRAQEKVHEMEKEHLNK 

VQTANEVKQAVEQQIQSHRETHQKQISSLRDEVE 

AKAKLJTDLQDQNQKMMLEQERLRVEHEKLKA 

TDQEKSRKLHELTVMQDRREQARQDLKGLEETV 

AKELQTLHNLRKLFVQDLATRVKKSAEIDSXDDT 

GGSAAQKQKISFLENNLE\QLTKSAQTSWYRDNA

DLRCELPKXEKRLRATAERVKALESALKEAKEN 

ASRDRKRYQQEVDRIKEAVRSKNMARRGHSAQI 

AKPIRPGQHPAASPTHPSAIRGGGAFVQNSQPVA 

VRGGGGKQV 

3511 

A 

1 

1757 

MASVQASRRQWCYLCDLPKMPWAMVWDFSEA 

VCRGCVNFEGADRIELLIDAARQLKRSHVLPEGR 

SPGPPALKHPATKDLAAAAAQGPQLPPPQAQPQP 

SGTGGGVSGQDRYDRATSSGRLPLPSPALEYTLG

SRLANGLGREEAVAEGARRALLGSMPGLMPPGL 

LAAAVSGLGSRGLTLAPGLSPARPLFGSDFEKEK 

QQRNADCLAELNEAMRGRAEEWHGRPKAVREQ 

LLALSACAPFNVRFKKDHGLVGRVFAFDATARP 

PGYEFELKLFTEYPCGSGNVYAGVLAVARQMFH 

DALREPGKALASSGFKYLEYERRHGSGEWRQLG 

ELLTDGVRSFREPAPAEALPQQYPEPAPAALCGP 

PPRAPSRNLAPTPRRRKASPEPEGEAAGKMTTEE 

OOORHWVAPGGPYSAETPGVPSPIAALKNVAEA ! 

LGHSPKDPGGGGGPVRAGGASPAASSTAQPPTQ 

HRLVARNGEAEVSPTAGAEAVSGGGSGTGATPG 

APLC\CTLCRERLEDTHFVQ\CPPVPEHKFCFPCSR 

KFIKAQGPAGEWYCPSGDKCPLVGSSVPWAFMQ 

GEIATILAGDIKVKKERDP 

3512 

A 

3 

1994 

NTNSSSVTNSAAGVEDLNIVQVTVPDNEKERLSS 

IEKIKQLREQVNDLFSRKFGEAIGVDFPVKVPYR 

KITFNPGCVVIDGMPPGVVFKAPGYLEISSMRRIL 

EAAEFIKJTVIRPLPGLELSNGEYSTVGKRKIDQE 

GRVFQEKWERAYFFVEVQNISTCLICKRSMSVSK 

EYNLRRHYQTNHSKHYDQYMERMRDEKLHELK 

KGLRKYLLGLSDTECPEQKQVFANPSPTQKSPVQ 

PVEDLAGNLWEKLREKJRSFVAYSIA1DEITDINN 

TTQLAIFIRGVDENFDVSEELLDTVPMTGTKSGN 

EIFSRVEKSLKNFCINWSKLVSVASTGTPPMVDA 

NNGLVTKLKSRVATFCKGAELKSICCIIHPESLCA 

Q\KLKMDHVMDWVKSVNWICSRGLNHSEFTTL 

LYELDSQYGSLLYYTEIKWLSRGLVLKRFFESLE 

EIDSFMSSRGKPLPQLSSIDWIRDLAFLVDMTMH 

LNALNISLQGHSQIXnTQMYDLIRAFLAKLCLWET 

HLTRNNLAHFPTLKLVSRNESDGLNYTPKIAELK 

TEFQKJRLSDFKLYESELTLFSSPFSTKIDSVHEELQ 

MEVIDLQCNTVLKTKYDKVGIPEFYKYLWGSYP 

KYKHHCAKILSMFGSTYICEQLFSIMKLSKTKYC 

SQLKDSQWDSVLHIAT 

3513

A 

1836 

513 

FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 
DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 
GPGNPLPDRLGEMAGGRHRRWGTLHLLLLVAA 
SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 





LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNII 

STLNPTAKRHLVLACHYDSKYFSHWXNNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL 

3514 

A 

1836 

513 

FKSLLSVKWFCFSILVLIFLGTRCYWEMTQSRPSP 

DPHRGRWEGGRSRPKGGEEGRRRTRVPGLVTAS 

GPGNPLPDRLGEMAGGRHRRVVGTLHLLLLVAA 

LPWASRGVSPSASAWPEEKNYHQPAILNSSALRQ 

IAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQ 

HIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNII 

STLNPTAKRHLVLACHYDSKYFSHW\NNRVFVG 

ATDSAVPCAMMLELARALDKKLLSLKTVSDSKP 

DLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKM 

ASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTF 

PNFFPNSARWFERLQAIEHELHELGLLKDHSLEG 

RYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFP 

EVWHTMDDNEENLDESTIDNLNKILQVFVLEYL 

HL 

3515 

A 

114 

754 

LCRDLTTTMSSKRTKTKTKKRPQRATSNVFAMF 

DQSQIQEFKEAFNMIDQNRDGFIDKEDLHDMLAS 

LGKNPTDEYLDAMMNEAPGPINFTMFLTMFGEK 

LNGTDPEDVIRNAFACFDEEATGTIQEDYLRELL 

TRMGDRF\TDE\EVDELYREAPI\DKKGGIFNYI\E 

FTRHLETGGPKDKDDRKITFQIPSPNVPWLATFG 

VFLEIFLLHGP 

3516 

A 

1 

5169 

MAAAPSALLLLPPFPVLSTYRLQSRSRPSAPETDD 

SRVGGIMRGEKNYYFRGAAGDHGSCfTTTSPLA 

SALLMPSEAVSSSWSESGGGLSGGDEEDTRLLQL 

LRTARDPSEAFQALQAALPRRGGRLGFPRRKEAL 

YRALGRVLVEGGSDEKRLCLQLLSDVLRGQGEA 

GQLEEAFSLALLPQLVVSLREENPALRKDALQDL 

HICLKRSPGEVLRTLIQQGLESTDARLRASTALLL 

PILLTTEDLLLGLDLTEVIISLARKLGDQETEEESE 

TAFSALQQIGERLGQDRFQSYISRLPSALRRHYN 

RRLESQFGSQVPYYLELEASGFPEDPLPCAVTLS 

NSNLKFGIIPQELHSRLLDQEDYKNRTQAVEELK 

QVLGKFNPSSTPHSSLVGFISLLYNLLDDSNFKVV 

HGTLEVLHLLVIRLGEQVQQFLGPVIAASVKVLA 

DNKLVIKQEYMKIFLKLMKEVGPQQVLCLLLEH 

LKHKHSRVREEVVNICICSLLTYPSEDFDLPKLSF 

DLAPALVDSKRRVRQAALEAFAVLASSMGSGKT 

SILFKAVDTVELQDNGDGVMNAVQARLARKTLP 

RJLTEQGr V b Y A V LMr oo ALjuKoIN rlLAHOAU I JJ 

WLLAGNRTQSAHCHCGDHVRDSMHIYGSYSPTI 

CimVLSAGKGKNKLPWENEQPGIMGENQTSTS 

KDIEQFSTYDFIPSAKLKLSQGMPVNDDLCFSRK 

RVSRNLFQNSRDFNPDCLPLCAAGTTGTHQTNLS 

GKCAQLGFSQICGKTGSVGSDLQFLGTTSSHQEK 

VYASLNFGSKTQQTFGSQTECTSSNGQNPSPGAY 

ILPSYPVSSPRTSPKHTSPLIISPKKSQDNSVNFSNS 

WPLKSFEGLSKPKSHRRSLSAQKSS\DPTGRVNHG 

\ENSQEKPPWQLTPAL\VRSPSSRRGLNGTKPVPPI 

PVRGISLLPDKADLSTVGHKKKEPDDIWKCEKDS 

LPIDLSELNFKDKDLDQEEMHSSLRSLRNSAAKK 

RAKLSGSTSDLESPDSAMKLDLTMDSPSLSSSPNI 

NSYSESGVYSQESLTSSLSTTPQGKRJMSDIFPTFG 

SKPCPTRLSSAKKKISHIAEQSPSAGSSSNPQQISS 

FDFTTTKALSEDSVVVVGKGVFGSLSSAFATCSQ 

SVISSVENGDTFSIKQSIEPPSGIYGRSVQQNISSYL 

DVENEKDAKVSISKSTYNKMRQKRKEEKELFHN 

KDCEKKEKNSWERMRHTGTEKMASESETPTGAI 

SQYKERMPSVTHSPEIMDLSELRPFSKPEIALTEA 

LRLLADEDWEKKIEGLNFIRCLAAFHSEILNTKL 

HETNFAWQEVKNLRSGVSRAAVVCLSDLFTYL 

KKSMDQELDTTVKVLLHKAGESNTFIREDVDKA 

LRAMVNNVTPARAVVSLINGGQRYYGRKMLFF 

MMCHPNFEKMLEKYVPSKDLPYIKDSVRNLQQK 

GLGEIPLDTPSAKGRRSHTGSVGNTRSSSVSRDA 

FNSAERAVTEVREVTRKSVPRNSLESAEYLKLIT 

GLLNAKDFRDRINGIKQLLSDTENNQDLVVGNIV 

K1FDAFKSIU.HDSNSKVT4LVALET^ 

HLSPIFNMLIPAIVDNNLNSKNPGIYAAATNVVQA 

LSQHVDNYLLLQPFCTKAQFLNGKAKQDMTEKL 

ADIVTELYQRKPHATEQKVLVVLWHLLGNMTN 

SGSLPGAGGNIRTATAKLSKALFAQMGQNLLNQ 

AASQPPHIKKSLEELLDMTILNEL 

3517 

A 

1449 

252 

QDLKPVLDREYLAIYLKMVFFTCNACGESVKKI 

QVEKHVSVCRNCECLSCIDCGKDFWGDDYKNH 

VKCISEDQKYGGKGY/EKVKTHKGD/ASKQQAW 

IQKISELIK\RPNVSPKVRELLEQISAFDNVPQ\KK 

AKJFQNWMK^SLKVHNESILDQVWNIFSEASNSE 

PVNKEQDQRPLHPVANPHAEISTKVPASKVKDA 

VEQQGEVKKNKRERXEERQKKJRKREKKELKLE 

NGSAGKRSKKKKQRKDSASEEEARVGAGKRKR 

RHSKVETDSKKKKMKLPEHPEGGEPEDDEAPAK 

GKFNWKGTIKAILKQAPDNEITIKKLRKKVLAQY 

YTVTDEHHRSEEELLVIFT^KKISKNPTFKLLKDK 

VKLVK 

3518 

A 

3 

635 

APDSNARNDHFDACSLRVQAGLSSAGPALGNSG 

L/VALMAoroJVA V I VrulNUOUL) V 1 1 riKj WiuW VK 

KELEKIPGFQCLAKNMPDPITARESIWLPFMETEL 
HCDEKTIIIGHSSGAIAAMRYAETHRVYAIVLVSA 
YTSDLGDENERASGYFTRPWQWEKIKANCPYIV 
QFGSTDDPFLPWKEQQEVADNSWKPNCTNSLTV 
ATFRTQSFMN 

3519 

A 

81 

2277 

VRETRREMAMAMSDSGASRLRRQLESGGFEARL 

KRNVYQNYRQFIETAREISYLESEMYQLSHLLTE 

QKSSLESIPLTLLPAAAAAGAAAASGGEEGVGGA 

GGRDHLRGQAGFFSTPGGASRDGSGPGEEGKQR 

TLTTLLEKVEGCRHLLETPGQYLVYNGDLVEYD 

ADHMAQLQRVHGFLMNDCLLVATWLPQRRGM 

YRYNALYSLDGLAVWVKDNPPMKDMFKLLMF 

PENRIFQAENAKIKJIEWLEVLEDTKRALSEKRRR 

EQEEAAAPRGPPQVTSKATNPFEDDEEEEPAVPE 

VEEEKVDLSMEWIQELPEDLDVCIAQRDFEGAV 

DLLDKLNHYLEDKPSPPPVKELRAKVEERVRQL 

TEVLVFELSPDRSLRGGPKATRRAVSQLIRLGQC 

TKACELFLRNRAAAVHTAIRQLRIEGATLLYIHK 

LCHVFFTSLLETAREFEIDFAGTDSGCYSAFVVW 

ARSAMGMFVDAFSKQVFDSKESLSTAAECVKVA 

lSJC-riv^V^V^l^VJl-V10L,iyl^ I r llrxrVJLJL. V tvi-'l^Vj/VL/rto I Iv 

EIIIEATKHRNSEEMWRRMNLMTPEALGKLKEE 

MKSCGVSNFEQYTGDDCWVNLSYTVVAFTKQT 

MGFLEEALKLYFPELHMVLLESLVEIILVAVQHV 

DYSLRCEQDPEKKAFIRQNASFLYETVLXPVVEK 

RFEEGVGKPAKQLQDLRNASRLIRVNPESTTSVV 

3520 

A 

1706 

540 

FVAHLAWPWRADGDMEDGVLNEGFLVKRGHIV 

HNWKARWFILRQNTLVYYKLEGGRRVTPPKGRI 

LLDGCTITCPCLEYENRPLLIKLKTQTSTEYFLEA 

CSREE/RRDAWAFEMTGAIHAGQARGKVQQLHS 

LRNSFKLPPHISLHRIVDKMHDSNTGIRSSPNMEQ 

GSTYKKTFLGSSLVDWLISNSFTASRLEAVTLAS 

AyfT K/TCRXTtTf DD\/r\/D Q\A(~l A ID O/TM A TirWIV nncT 

lVLL^rvlxiri JN r L,Kr VUV K5 M OA i Ko O V L> Ail K£r LDUo i 

ALYTFAESYKKKISPKEEISLSTVELSGTVVKQGY 

LAKQGHKRXNWKVRRFVLRKDPAFLHYYDPSK 

EENRPVGGFSLRGSLVSALEDNGVPTGVKGNVQ 

GNLFKVITK\DDTHYYIQA\SSKAE\RAE\WIGSLS 

KSLNMNKDPEGTPDSLPSLPR 

3521 

A 

3 

3063 

HASVSLSLGCPRPCADTPGPQPQPMDLRVGQRPP 

VEPPPEPTLLALQRPQRLHHHLFLAGLQQQRSVE 

PMRVKMELPACGATLSLVPSLPAFSIPRHQSQSST 

PCPFLGCRPCPQLSMDTPMPELQEAPQEQELRQL 

LHKDKSKRSAVASSVVKQKLAEVILKXQQAALE 

RTVHPNSPGIPYRTLEPLETEGATRSMLSSFLPPV 

PSLPSDPPEHFPLRKTVSEPNLKLRYKPKKSLERR 

KNPLLRKESAPPSLRRRPAETLGDSSPSSSSTPAS 

GCSSPNDSEHGPNPILGSEALLGQRLRLQETSVAP 

FALPTVSLLPAITLGLPAPARADSDRRTHPTLGPR 

GPILGSPHTPLFLPHGLEPEAGGTLPSRLQPILLLD 

PSGSHAPLLTVPGLGPLPFHFAQSLMTTERLSGSG 

LHWPLSRTRSEPLPPSATAPPPPGPMQPRLEQLKT 

HVQVIKRSAKPSEKPRLRQIPSAEDLETDGGGPG 

QVVDDGLEHRELGHGQPEARGPAPLQQHPQVLL 

WEQQRLAGRLPRGSTGDTVLLPLAQGGHRPLSR 

AQSSPAAPASLSAPEPASQARVLSSSETPARTLPF 

TTGLIYDSVMLKHQCSCGDNSRHPEHAGRIQSIW 

SRLQERGLRSQCECLRGRKASLEELQSVHSERHV 

LLYGTNPLSRLKLDNGKLAGLLAQRMFVMLPCG 

GVGVDTDTIWNELHSSNAARWAAGSVTDLAFK 

VASRELKNGFAVVRPPGHHADHSTAMGFCFFNS 

VAIACROLOOOSKASKILIVDWDVHHGNGTOOT 

FYQDPSVLYISLHRHDDGNFFPGSGAVDEVGAGS 

GEGFNVNVAWAGGLDPPMGDPEYLAAFRIVVM 

PIAREFSPDLVLVSAGFDAAEGHPAPLGGYHVSA 

KCFGYMTQQLMNLAGGAVVLALEGGHDLTAIC 

DASEACVAALLGNRVDPLSEEGWKQKPNLNAIR 

SLEAWIRVHSKYWGCMQRLASCPDSWVPRVPG 
ADKEEVEAVTALASLSVGILAEDRPSEQLVEEEE 
PMNL 

3522 

A 

9 

602 

KMAALGEPVRLERDICRATFI I FKf ORSGFVPPO 
KLQALQRVLQSEFCNAVREVYEHVYETVDISSSP 
EVRANATAKATVAAFAASEGHSHPRVVELPKTE 
EGLGFNIMGGKEQNSPIYISRIIP/GGIADRHGGLK 
RGDQLLSVNGVSVEGEHHEKAVELLKAAQGKV 
KLVVRYTPKVLEEMESRFEKMRSAKRRQQT 

3523 

A 

645 

1465 

IMAETSLLEAGASAASTAAALENLQVEASCSVCL 

EYLKEPVIIECGHNFCKACITRWWEDLERDFPCP 

VCRKTSRYRSLRPNRQLGSMVEIAKQL\RPSSGRS 

fJIVTRASAPORffFAT ST FrVFDOFAVn IP A TST-TTO 

RAHTVVPLDDATQEYKEKLQKCLEAVLNQKLQEI 

TRCKSSEEKKPGELKRLVESRRQQILREFEELHRR 

LDEEQQVLLSRLEEEEQDILQRLRENAAHLGDKR 

RDLAHLAAEVEGKCLQSGFEMLKVRPLPLHSPS 

G 

3524 

A 

3 

698 

PMVRHEAGEALGAIGDPEVLEILKQYSSDPVIEV 

AFTCOI AVRRT FWT OOHfJOPPA AOPVT QVnPAP 

PAEER\DVGRLREALLDESRPLFERYRAMFALRN 

AGGEEAALALAEGLHCGSALFRHEVGYVLGQLQ 

HEAAVPQLAAALARCTENPMVRHECAEALGAIA 

RPACLAALQAHADDPERVVRE\SCKVALDMYEH 

ETGRAFQYADGLEQLRGAPSLGPNPHPELPEDS 

3525 

A 

1452 

694 

EGLQRPEYLVASAAGFQGLAWGGEGRGRAGCS 
SSGFRDAEPLLLSCPGRNEPLKKERLKWKSDYP 

A/fTnnOT RSl^PnFPWnTAPAFPriRT^FTWT^ A f V A 
lvi I i^ov^JUlvolSJvL/Crr WL/l /\Jr/\r E,ot\Jtvc»l W UPkLj]\J\ 

AAYAAEANDHELAQAILDGASITLPHGTLCECY 

DELGNRYQLPIYCLSPPVNLLLEHTEEESLEPPEP 

PPSVRREFPLKVRLSTGKDVRLSASLPDTVGQLK 

RQLHAQE/GTPKPSWQRWFFSGKLLTDRTRLQET 

KIQKDFVIQVIINQPPPPQD 

3526 

A 

123 

3441 

PGNEGLGLAADHNEDLGHLSADAPWPAVTMAP 

RKRSHHGLGFLCCFGGSDIPEINLRDNHPLQFME 

FSSPDPNAEELNIRFAELVDELDLTDKNREAMFAL 

PPEKKWQIYCSKKK£QEDPNKLATSWPDYY1DRI 

NSMAAMQSLYAFDEEETEMRNQVVEDLKTALR 

TQPMRFVTRFIELEGLTCLLNFLRSMDHATCESRI 

HTSLIGCIIALMNNSQGRAHVLAQPEAISTIAQSL 

RTENSKTKVAVLEILGAVCLVPGGHKKVLQAML 

HYQVYAAERTRFQTLLNELDRSLGRYRDEVNLK 

TAIMSFINAVLNAGAGEDNLEFRLHLRYEFLMLG 

IQPVIDKLRQHENAILDKHLDFFEMVRNEDDLEL 

ARRFDMVHIDTKSASQMFELIHKKLKYTEAYPC 

LLSVLHHCLQMPYKRNGGYFQQWQLLDRJLQQ1 

VLQDERGVDPDLAPLENFNVKNIVNML[NENEV 

KQWRDQAEKFRKEHMELVSRLERKERECETKTL 

EKEEMMRTvLNKMKDKLARESQELRQARGQVA 

ELVAQLSELSTGPVSSPPPPGGPLTLSSSMTTNDL 

PPPPPPLPFACCPPPPPPPLPPGGPPTPPGAPPCLG 

MGLPLPQDPYPSSDVPLRKKRVPQPSHPLKSFNW 

VKLNEERVPGTVWNEIDDMQVFRILDLEDFEKM 

FSAYQRHQELITNPSQQKELGSTEDIYLASRKVK 

ELSVIDGRRAQNCIILLSKLKLSNEEIRQAILKMD 

EQEDLAKDMLEQLLKFIPEKSDIDLLEEHKHEIER 

MARADRFLYEMSRIDHYQQRLQALFFKKKFQER 

LAEAKPKVEAILLASRELVRSKRLRQMLEVILAI 

GNFMNKGQRGGAYGFRVASLNKIADTKSSIDRN 

ISLLHYLIMILEKHFPDILNMPSELQHLPEAAKVN 

L-AJbLc-rvJ^ V LjIN L, KJvvj LKA Vr,V r,L,n, i ^KKy V KJbro 

DKFVPVMSDFITVSSFSFSELEDQLNEARDKFAK 
ALMHFGEHDSKMQPDEFFGIFDTFLQAFSEARQD 
LEAMRRRKEEEERRARMEAMLKEQRERERWQR 
QRKVLAAGSSLEEGGEFDDLVSALRSGEVFDKD 
LCKLKRSRKRSGSQALEVTRERAINRLNY 

3527 

A 

1445 

714 

LLGTRMLAGQLEARDPKEGTHPEDPCPGAGAV 
MEKTAVAAEVLTEDCNTGEMPPLQQQIIRLHQE 

f nD r\v c T wi a t"\\/ij/^*i/'t dot_jtt\at Dcr\\fik jtcx t>t?i/'t 
LOKl^KoL WAU VHOJKXKohQL)AEK^ 

RALQLQRWKARKKSAASPHAGQESHTLALEPAF 

GKISPLSADEETIPKYAGHKNXQSGHSSWGQRSSS 

NNSAPPKPMSLKIERISSWKTPPQENRDKNLSRR 

RQDRRATPTGRPTPCAERRGWSEDGKVASDTCV 

TLHWPLGKFRFR 

3528 

A 

484 

1777 

RJSKIQVYYSTGYSSRKMNPTLGLAIFLAVLLTVK 

GLLKPSFSPRNYKALSEVQGWKQRMAAKELAR 

QNMDLGFKLLKKLAFYNPGRNIFLSPLSISTAFS 

MLCLGAQDSTLDEIKQGFNFRKMPEKDLHEGFH 

YIIHELTQKTQDLKLSIGNTLFIDQRLQPQRKFLE 

DAKNFYSAETILTNFQNLEMAQKQINDFI/ESKTH 

GKIhTNLIENIDPGTVMLLANYIFFRARWKHEFDP 

NVTl^bDFrLEl^SSVKVPMMFKSGlYQVGYD 

KLSCTILEIPYQKNITAIFILPDEGKLKHLEKGLQV 

DTFSRWKTLLSRRVVDVSVPRLHMTGTFDLKKT 

LSYIGVSKIFEEHGDLTKIAPHRSLKVGEAVNKA 

ELKMDERGTEGAAGTGAQTLPMETPLVVKIDKP 

YLLLIYSEKJPSVLFLGKIVNPIGK 

3529 

A 

1 

5684 

VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYIIQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYEVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKVVSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 

NFNTHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TNPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL 

FTELAKVIESSAKGFPSFISDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEE\NETGFDFVVS\DLEHISPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVWSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 
QGIHQREFKPYVVRLAKLLRKRAKKNPEEDNSG 
RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 
NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 
CSSKARQKIEEMVEKDFLEGMIKT 

3530 

A 

1 

5684 

VSSVSHENPTEVFEDGENPPSSRSSESGFTEFIQY 

QADRTDDIDRELSEGQGAAAIPIGSTSSETETAST 

VGSEETIIQTPSVVTQGTATRSRKTAQKTAMQCC 

LEYVQQFLTRLINLYIIQNNSFSQSLATEHQGDLG 

REQGETSKWDRNSQGDVKEKNISKQKTSKEYLS 

AFLAACQLFLECSSFPVYIAEGNHTSELRSEKLET 

DCEHVQPPQWLQTLMNACSQASDFSVQSVAISL 

VMDLVGLTQSVAMVTGENINSVEPAQPLSPNQG 

RVAVVIRPPLTQGNLRYIAEKTEFFKHVALTLWD 

QLGDGTPQHHQKSVELFYQLHNLVPSSSICEDVI 

SQQLTHKDKKIRMEAHAKFAVLWHLTRDLHINK 

SSSFVRSFDRSLFIMLDSLNSLDGSTSSVGQAWL 

NQVLQRHDIARVLEPLLLLLLHPKTQRVSVQRV 

QAERYWNKSPCYPGEESDKHFMQNFACSNVSQ 

VQLITSKGNGEKPLTMDEIENFSLTVNPLSDRLSL 

LSTSSETIPMVVSDFDLPDQQIEILQSSDSGCSQSS 

AGDNLSYFVDPETVNAQEDSQMPKESSPDDDVQ 

QVVFDLICKVVSGLEVESASVTSQLEIEAMPPKC 

SDIDPDEETIKIEDDSIQQSQNALLSNESSQFLSVS 

AEGGHECVANGISRNSSSPCISGTTHTLHDSSVAS 

IETKSRQRSHSSIQFSFKEKLSEKVSEKETIVKESG 

KQPGAKPKVKLARKKDDDKKKSSNEKLKQTSV 

FFSDGLDLENWYSCGEGDISEIESDMGSPGSRKSP 

NFNIHPLYQHVLLYLQLYDSSRTLYAFSAIKAILK 

TNPIAFVNAISTTSVNNAYTPQLSLLQNLLARHRI 

SVMGKDFYSHIPVDSNHNFRSSMYIEILISLCLYY 

MRSHYPTHVKVTAQDLIGNRNMQMMSIEILTLL 

FTELAKVIESSAKGFPSFISDMLSKCKVQKVILHC 

LLSSIFSAQKWHSEKMAGKNLVAVEEGFSEDSLI 

NFSEDEFDNGSTLQSQLLKVLQRLIV\LEHRVM\T 

IPEE\NETGFDFVVS\DLEHISPHQPMTSLQYLHAQ 

SITCQGMFLCAVIRAVLHQHCACKMHPQWIGLIT 

STLPYMGKVLQRVVVSVTLQLCRNLDNLIQQYK 

YETGLSDSRPLWMASIIPPDMILTLLEGITAIIHYC 

LLDPTTQYHQLLVSVDQKHLFEARSGILSILHMI 

MSSVTLLWSILHQADSSEKMTIAASASLTTINLG 

ATKNLRQQILELLGPISMNHGVHFMAAIAFVWN 

ERRQNKTTTRTKVIPAASEEQLLLVELVRSISVM 

RAETVIQTVKEVLKQPPAIAKDKKHLSLEVCML 

QFFYAYIQRIPVPNLVDSWASLLILLKDSIQLSLP 

APGQFLILGVLNEFIMKNPSLENKKDQRDLQDVT 

HKIVDAIGAIAGSSLEQTTWLRRNLEVKPSPKIM 

VDGTNLESDVEDMLSPAMETANITPSVYSVHAL 

TLLSEVLAHLLDMVFYSDEKERVIPLLVNIMHYV 

VPYLRNHSAHNAPSYRACVQLLSSLSGYQYTRR 

AWKKEAFDLFMDPSFFQMDASCVNHWRAIMDN 

LMTHDKTTFRDLMTRVAVAQSSSLNLFANRDVE 

LEQRAMLLKRLAFAIFSSEIDQYQKYLPDIQERLV 

ESLRLPQVPTLHSQVFLFFRVLLLRMSPQHLTSL 

WPTMITELVQVFLLMEQELTADEDISRTSGPSVA 

GLETTYTGGNGFSTSYNSQRWLNLYLSACKFLD 

LALALPSENLPQFQMYRWAFIPEASDDSGLEVRR 

QGIHQREFKPYVVRLAKLLRKRAKKNPEEDNSG 

RTLGWEPGHLLLTICTVRSMEQLLPFFNVLSQVF 

NSKVTSRCGGHSGSPILYSNAFPNKDMKLENHKP 

CSSKARQKIEEMVEKDFLEGMIKT 

3531 

A 

553 

2470 

LISPSPALSSQDPALSLKENLEDISGWGLPEARSK 

ESVSFKDVAVDFTQEEWGQLDSPQRALYRDVM 

LENYQNLLALGPPLHKPDVISHLERGEEPWSMQ 

REVPRGPCPEWELKAVPSQQQGICKEEPAQEPIM 

ERPLGGAQAWGRQAGALQRSQAAP\GR\RTCHG 

LGRPWEEFPLRCPLFAQQRVPEGGPLLDTRKNV 

QATEGRTKAPARLCAGENASTPSEPEKFPQVRRQ 

RGAGAGEGEFVCGECGKAFRQSSSLTLHRRWHS 

REKAYKCDECGKAFTWSTNLLEHRRIHTGEKPFF 

CGECGKAFSCHSSLNVHQRIHTGERPYKCSACEK 

AFSCSSLLSMHLRVHTGEKPYRCGECGKAFNQR 

THLTRHHRIHTGEKPYQCGSCGKAFTCHSSLTVH 

EKIHSGDKPFKCSDCEKAFNSRSRLTLHQRTHTG 

EKPFKCADCGKGFSCHAYLLVHRRMSGEKPFK 

NECGKAFSSHAYLIVHRRIHTGEKPFDCSQCWKA 

IKHQKIHSGEKSFKCEKCGEMFNWSSHLTEHQRL 
HSEGKPLAIQFNKHLLSTYYVPGSLLGAGDAGLR 
DVDPIDALDVAKLLCVVPPRAGRNFSLGSKPRN 

3532 

A 

3931 

317 

HRELQDSPSAEPPAGSMPLRHWGMARGSKPVGD 
GAQPMAAMGGLKVLLHWAGPGGGEPWVTFSES 

SLTAEEVCIHIAHKVGITPPCFNLFALFDAQAQV 

WLPPNHILEIPRDASLMLYFVRHRFYSR\NWHGM 

NPREPAVYRCGPPGTEASSDQTAQGMQLLDPAS 

FEYLFEQGKHEFVNDVASLWELSTEEEIHHFKNE 

SLGMAFLHLCHLALRHGIPLEEVAKXTSFKDCIP 

RSFRRHIRQHSALTRLRLRNVFRRf LRDFQPGRLS 

QQMVMVKYLATLERLAPRFGTERVPVCHLRLLA 

QAEGEPCYIRDSGVAPTDPGPESAAGPPTHEVLV 

TGTGGIQWWPVEEEVNKEEGSSGSSGRNPQASL 

FGKKAKAHKAFGQPADRPREPLGAYFCDFRDIT 

HVGLKEHCVSIHRQDNKCLELSLPSRAAALSFVS 

LVDGYFRLTADSSHYLCHEVAPPRLVMSIRDGIH 

GPLLEPFVQAKLRPEDGLYLIHWSTSHPYRLILTV 

AQRSQAPDGMQSLRLRKFPIEQQDGAFVLEGWG 

RSFPSVRELGAALQGCLLRAGDDCFSLRRCCLPQ 

PGETSNLIIMRGARASPRTLNLSQLSFHRVDQKEI 

TQLSHLGQGTRTNVYEGRLRVEGSGDPEEGKMD 

DEDPLVPGRDRGQELRVVLKVLDPSHHDIALAF 

YETASLMSQVSHTHLAFVHGVCVRGPENIMVTE 

YVEHGPLDVWLRRERGHVPMAWKMVVAQQLA 

SALSYLENKNLVHGNVCGRNILLARLGLAEGTSP 

FIKLSDPGVGLGALSREERVERIPWLAPECLPGG 

ANSLSTAMDKWGFGATLLEICFDGEAPLQSRSPS 

EKEHFYQRQHRLPEPSCPQLATLTSQCLTYEPTQ 

RPSFRTILRDLTRLQPHNLADVLTVNPDSPASDPT 

VFHKRYLKKIRDLGEGHFGKVSLYCYDPTNDGT 

GEMVAVKALKADCGPQHRSGWKQEIDILRTLYH 

EHIIKYKGCCEDQGEKSLQLVMEYVPLGSLRDYL 

PRR^irnr A OT T T FAOnTPPfiMAVT W A rH-IVIURHl 
a xvrioiwij/vv^i^ijijr /\v^v^ii^iioiVjL/\ i i^rl/W^ri i liit\LJL> 

AARNVLLDNDRLVKIGDFGLAKAVPEGHEYYRV 

REDGDSPVFWYAPECLKEYKFYYASDVWSFGVT 

LYELLTHCDSSQSPPTKFLELIGIAQGQMTVLRLT 

ELLERGERLPRPDKCPCEVYHLMKNCWETEASF 

RPTFENLIPILKTVHEKYQGQAPSVFSVC 

3533 

A 

182 

3465 

FRWLDFFRGSINSQFEFGRKKENMTSPAKPKKDK 

EIIAEYDTQVKEIRAQLTEQMKCLDQQCELRVQL 

LQDLQDFFRKKAEDEMDYSRNLEKLAERFLAKT 

RSTKDQQFKKDQNVLSPVNCWNLLLNQVKRES 

RDHTTLSDIYLNNIIPRFVQVSEDSGRLFKKSKEV 

GQQLQDDLMKVLNELYSVMKTYHMYNADSISA 

QSKLKEAEKQEEKQIGKSVKQEDRQTPRSPDSTA 

NVWEEKHVRRSSVKKIEKMKEKRQAKYTENKL 

KAIKARNEYLLALEATNASVFKYYIHDLSDLIDQ 

CCDLGYHASLNRALRTFLSAELNLEQSKHEGLD 

AIENAVENLDATSDKQRLMEMYNNVFCPPMKFE 

FQPHMGDMASQLCAQQPVQSELLQRCLQLQSRL 

STLKIENEEVKKTMEATLQTIQDIVTVEDFDVSD 

CFQYSNSMESVKSTVSETFMSKPSIAKRRANQQE 

TEQFYFTKMKEYLEGRNLITKLQAKHDLLQKTL 

GESQRTDCSLARRSSTVRKQDSSQAIPLVVESCIR 

FISRHGLQHEGIFRVSGSQVEVNDIKNAFERGEDP 

LAGDQNDHDMDSIAGVLKLYFRGLEHPLFPKDIF 

HDLMACVTMDNLQERALHIRKVLLVLPKTTLII 

MRYLFAFLNHLSQFSEENMMDPYNLAICFGPSL 

MSVPEGHDQVSCQAHVNELIKTniQHENIFPSPRE 

LEGPVYSRGGSMEDYCDSPHGETTSVEDSTQDV 

TAEHHTSDDECEPIEAIAKFDYVGRTARELSFKK 

GASLLLYQRASDDWWEGRHNGIDGLIPHQYIVV 

QDTEDGVVERSSPKSEIEVISEPPEEKVTARAGAS 

CPSGGHVADIYLANINKQRKRPESGSIRKTFRSDS 

HGLSSSLTDSSSPGVGASCRPSSQPIMSQSLPKEG 

AGRSKSFDNHRPMDPEVIAQDIEATMNSALNELR 

ELERQSSVKHTPDWLDTLEPLKTSPVVAPTSEPS 

SPLHTQLLKDPEPAFQRSASTAGDIACAFRPVKS 

VKMAAPVKPPATVRPKPTVVFPKTNATSPGVNSST 

SPQSTDKSCTV 

3534 

A 

1 

2640 

FRRFVCPASRRPAAGLRDAASSAPRGMASEGPRE 

PESEGIKLSADVKPFVPRFAGLNVAWLESSEACV 

FPSSAATYYPFVQEPPVTEQKIYTEDMAFGASTFP 

PQYLSSEITLHPYAYSPYTLDSTQNVYSVPGSQY 

LYNQPSCYRGFQTVKHRNENTCPLPQEMKALFK 

KKTYDEKKTYDQQKFDSERADGTISSEIKSARGS 

HHLSIYAENSLKSDGYHKRTDRKSRIIAKNVSTS 

KPEFEFTTLDFPELQGAENNMSEIQKQPKWGPVH 

SVSTDISLLREVVKPAAVLSKGEIVVKNNPNESV 

TANAATNSPSCTRELSWTPMGYVVRQTLSTELS 

AAPKNVTSMINLKTIASSADPKNVSIPSSEALSSD 

PSYNKEKHIIHPTQKSKASQGSDLEQNEASRKNK 

KKKEKSTSKYEVLTVQEPPRIEDAEEFPNLAVAS 

ERRDRIETPKFQSKQQPQDNFKNNVKKSQLPVQL 

DLGGMLTALEKKQHSQHAKQSSKPVVVSVGAV 

PVLSKECASGERGRRMSQMKTPHNPLDSSAPLM 

KKGKQREIPKAKKPTSLKKIILKERQERKQRLQE 

NAVSPAFTSDDTQDGESGGDDQFPEQAELSGPEG 

MDELISTPSVEDKSEEPPGTELQRDTEASHLAPN 

HTTFPKIHSRRFRDYCSQMLSKEVDACVTDLLKE 

LVRFQDRMYQKDPVKAKTKRRLVLGLREVLKH 

t vi v vi vr t \/nQT>KTr i T7VTr\QvnnT nim wtiftyva 

Ll\J^JMSJLrvC V llorlN L,c<iSJ.V^oJ^U O LrjJlJ 1 J_»rl 1 IID I s\ 

CEQNIPFWALNRKALGRSLNKAVPVSVVGIFSY 
DGAQDQFHKMVELTVAARQAYKTMLENVQQE 
LVGEP\SLRHLPAYPHRAPAALQKMAPQP/VKEK 
EEPHYIEIWKKHLEAYSGCTLELEESLEASTSQM 
MNLNL 

3535 

A 

1747 

983 

LFQFQVCRSVLSPRAAGCTWSLAPRSRGAAGSPR 
RYRGPQPQPAPPSALPNSRPSPVASGREMVVLSV 

PAT7VTVTT I FlTPnTTTPT A FVKDTT FPYiFFNVKFY 
r/\tli V 1 V LLtLslJiEi\J 111 Jrl/VT V J\_LJll_»r r I 1JDOIN V JVC I 

LQTHWEEEECQQDVSLLRKQWADVVPAVRKW 

REAGMKVYIYSSGSVEAQKLLFGHSTEGDILELV 

DGHFDTKIGHKVESESYRKIADS1GCST3WILFLT 

DVTREASAAEEADVHVAWVRPGNAGLTDDEK 

TYYSLITSFSELYLPSST 

3536 

A 

3 

1302 

GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTS 

IESRGRPAASAGLRRDRCALRRWPLRRAPLARAT 

RRR AGSPRRCAPRPRACPOGWSRARHOPGGLCL 

LLLLLCQFMEDRSAQAGNCWLRQAKNGRCQVL 

YKTELSKEECCSTGRLSTSWTEEDVNDNTLFKW 

MrP^GGAPNCIPCKETCENVDCGPGKXCRMNKK 

NKPRCVCAPDCSNITWKGPVCGLDGKTYRNECA 

LLKARCKEQPELEVQYQGRCKKTCRDVFCPGSS 

TCVWDQTNNAYCVTCNRICPEPASSEQYLCGND 
GVTYS\SACHLRKATCLLGRSIGLAYEGKCIKAK 
SCEDIQCTGGKXCLWDFKVGRGRCSLCDELCPD 
SKSDEPVCASDNATYASECAMKEAACSSGVLLE 
VKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 

3537 

A 

285 

2123 

IGLFLQVAPLSVMAKSCPSVCRCDAGFIYCNDRF 

LTSIPTGIPEDATTLYLQNNQINNAGIPSDLKNLL 

KVERIYLYHNSLDEFPTNLPKYVKELHLQENNIR 

TITYDSLSKIPYLEELFLLDDNSVSAVSIEEGAFRD 

SNYLRLLFLSRNHLSTIPWGLPRTIEELRLDDNRIS 

TISSPSLQGLTSLKRLVLDGNLLNNHGLGDKVFF 

NLVNLTELSLVRNSLTAAPVNLPGTNLRKLYLQ 

DNHINRVPPNAFSYLRQLYRLDMSNNNLSNLPQ 

GIFDDLDNITQLILRNNPWYCGCKMKWVRDWL 

QSLPVKVNVRGLMCQAPEKVRGMAIKDLNAELF 

DCKDSGIVSTIQITTAIPNTVYPAQGQWPAPVTK 

QPDIKNPKLTKDHQTTGSPSRKTITITVKSVTSDTI 

TTTCITJT/'I A I Til *T A I T» T Oil 7T t/l 1 ffin A TT< OTTFTl'l fT 

HISWI<XALPMTALRLSWLI<XGHSPAFGSITETIV 

GERSEYLVTALEPDSPYKVCMVPMETSNLYLFD 

ETPVCIETETAPLRMYNPTTTLNREQEKEPYKNP 

NLPLAAIIGGAVALVTIALLALVCWYVHRNGSLF 

SRNCAYSKGFIRRXDDYAEAGTKKDNSILEIRETS 

FQMLPISNEPISKEEFVIHTIFPPNGMNLYKNNH 

3538 

A 

877 

6184 

WNVKPSLLVVQLFKFSDKEEHEQNDSISGKTGET 

GVEEMIATRKVEQDSKETVKLSHEDDHILEDAGS 

SDISSDAACTNPNKTENSLVGLPSCVDEVTECNL 

ELKDTMGIADKTENTLERNKIEPLGYCEDAESNR 

QLESTEFNKSNLEVVDTSTFGPESNILENAICDVP 

DQNSKQLNAIESTKIESHETANLQDDRNSQSSSV 

SYLESKSVKSKHTKPVIHSKQNMTTDAPKKIVAA 

KYEVIHSKTKVNVKSVKJWTDVPESQQNFHRPV 

KVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKK 

TLQDQTLVQIFKPLTHSLSDKSHAHPGCLKEPHH 

PAQTGHVSHSSQKQCHKPQQQAPAMKTNSHVK 

EELEI^GVEHFKEEDKLKLKKPEKNLQPRQRRSS 

KSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYMW 

TPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDC 

VGLSLSQAQQMGEEDKEYVCVKCCAEEDKKTEI 

LDPDTLENQATVEFHSGDKTMECEKLGLSKHTT 

NDRTKYIDDTVKHKVKJLKRESGEGRNSSDCRD 

NEIKKWQLAPLRJCMGQPVLPRRSSEEKSEKIPKE 

STTVTCTGEKASKPGTHEKQEMKKK^ 

NVrlPAASASKPSADQniQSVRHSLKDILMKJRLTD 

SNLKVPEEKAAKVATKIEKELFSFFRDTDAKYKN 

KYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIR 

MSPEELASKELAAWRRRENRHTIEMIEKEQREVE 

RRPITKITHKGEIEffiSDAPMKEQEAAMEIQEPAA 

NKSLEKPEGSEK\RKEEVDSMSKDTTSQ1WQHLF 

DLNCKICIGRMAPPVDDLSPKKVXVVVGVARKH 

olJlNliAliolAlJAL.oo 1 oXNIljAoiirrtiJDlilVV^Cor Jvo 1 r 

SPAPr^EMPGTVEVESTFLARLNnWKGFINMPS 

VAKFVTKAYPVSGSPEYLTEDLPDSIQVGGRISPQ 

T\nVDYVEKIKASGTKEICVVRFTPVTEEDQISYT 

LLFAWSSRKRYGVAANOTv4KQVra)MYLIPLGAT 

DKIPHPLVPFDGPGLELHRPNLLLGLIIRQKLKRQ 

HSACASTSHIAETPESAPPIALPPDKKSKIEVSTEE 

APEEENDFFNSFTTVLHKQRNKPQQNLQEDLPTA 

VEPLMEVTKQEPPKPLRFLPGVLIGWENQPTTLE 

LANKPLPVDDILQSLLGTTGQVYDQNAQSVMEQ 

NTVKEIPFLNEQTNSKIEKTDNVEVTDGENKEIK 

VKVDNISESTDKSAEIETSVVGSSSISAGSLTSLSL 

RGKPPDVSTEAFLTNLSIQSKQEETVESKEKTLKR 

QLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGN 

VSCSENLVANTARSPQFINLKRDPRQAAGRSQPV 

TTSESKDGDSCRNGEKHMLPGLSHNKEHLTEQIN 

VEEKLCSAEKNSCVQQSDNLKVAQNSPSVENIQT 

SQAEQAKPLQEDILMQNIETVHPFRRGSAVATSH 

FEVGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRP 

QQPNLQHLKSSPPGFPFPGPPNFPPQSMFGFPPHL 

PPPLLPPPGFG\FA\QNPMVPWPPVV\HLP\GQPQR 

MMGPLSQASRYIGPQNFYQVKDIRRPERRHSDP 

WGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKR 

ERKEKEWEQESERHRRRDRSQDKDRDRKSREEG 

HKDKERARLSHGDRGTDGKASRDSRNVDKKPD 

KPKSEDYEKDKEREKSKHREGEKDRDRYHKDR 

DHTDRTKSKR 

3539 

A 

157 

1769 

GSWTVELSLKPSASPSLKWVCLPGAAAVNKHRS 

GAGGLIRSLIQCTWAPAGPARRGGRGIEDFPYLF 

FQLTHCQQRICSVTQAGVQWCDHSSLQPQTPGL 

NQSSHLSLLSSRDYRMLSSFNEWFWQDRFWLPP 

NVTWTELEDRDGRVYPHPQDLLAALPLALVLLA 

MRLAFERFIGLPLSRWLGVRDQTRRQVKPNATL 

EKHFLTEGHRPKEPQLSLLAAQCGLTLQQTQRW 

FRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGL 

SVLYHESWLWAPVMCWDRYPNQLTLSCPAADS 

EAXSLYWWYLLELGFYLSLLIRLPFDVKRKGGGP 

SSIKPRPHYDPPSTA\DFKEQVIHHFVAVILMTFSY 

SANLLRIGSLVLLLHDSSDYLLEACKMVNYMQY 

QQVCDALFLIFSFVFFYTRLVLFPTQILYTTYYESI 

SNRGPFFGYYFFNGLLMLLQLLHVFWSCLILRML 

YSFMKKGQMEICDIRSDVEESDSSEEAAAAQEPL 

QLKNGTAGGPRPAPTDGPRSRVAGRLTNRHTTA 

T 

3540 

A 

267 

1397 

SPAGYCHSGLLPGCSRSA/CADLAKHQELPGKKL 

LSEKKLKRYFVDYRRVLVCGGNGGAGASCFHSE 

PRKEFGGPDGGDGGNGGHVILRVDQQVKSLSSV 

LSRYQGFSGEDGGSKNCFGRSGAVLYIRVPVGTL 

VKEGGRVVADLSCVGDEYIAALGGAGGKGNRF 

FLANNNRAPVTCTPGQPGQQRVLHLELKTVAHA 

GMVGFPNAGKSSLLRAISNARPAVASYPFTTLKP 

HVGIVHYEGHLQIAVADIPGIIRGAHQNRGLGSA 

FLRHIERCRFLLFVVDLSQPEPWTQVDDLKYELE 

MYEKGLSARPHAIVANKIDLPEAQANLSQLRDH 

LGQEVIVLSALTGENLEQLLLHLKVLYDAYAEA 

JiLO vvjK^Jt JL.K W 

3541 

A 

1 

8008 

DTQVSETLKRFAGKVTTASVKERREILSELGKCV 

AGKDLPEGAVKGLCKLFCLTLHRYRDAASRRAL 

QAAIQQLAEAQPEATAKNLLHSLQSSGIGSKAGV 

PSKSSGSAALLALTWTCLLVRIVFPSRAKRQGDI 

WNKLVEVQCLLLLEVLGGSHKHAVDGAVKKLT 

KLWKENPGLVEQYLSAILSLEPNQNYAGMLGLL 

VQFCTSHKEMDVVSQHKSALLDFYMKNILMSK 

VKPPKYLLDSCAPLLRYLSHSEFKDLTLPTIQKSL 

LRSPENVIETISSLLASVTLDLSQYAMDIVKGLAG 

HLKSNSPRLMDEAVLALRNLARQCSDSSAMESL 

TKHLFAILGGSEGKLTVVAQKMSVLSGIGSVSHH 

VVSGPSSQVLNGIVAELFIPFLQQEVHEGTLVHA 

VSVLALWCNRPTMEVPKKLTEWFKKAFSLKTST 

SAVRHAYLQCMLASYRGDTLLQALDLLPLLIQT 

VEKAASQSTQVPTITEGVAAALLLLKLSVADSQA 

EAKLSSFWQLIVDEKKQVFTSEKFLVMASEDAL 

CTVLH\LTERLFLDHPHRLTGNKVQQYHRALVA 

VLLSRTWHVRRQAQQTVRKLLSSLGGFKLAHGL 

LEELKTVLSSHKVLPLEALVTDAGEVTEAGKAY 

VPPRVLQEALCVISGVPGLKGDVTDTEQLAQEM 

LIISHHPSLVAVQSGLWPALLARMKIDPEAFITRH 

LDQIIPRMTTQSPLNQSSMNAMGSLSVLSPDRVL 

PQLISTITASVQNPALRLVTREEFAIMQTPAGELY 

DKSIIQSAQQDSIKKANMKRENKAYSFKEQIIELE 

LKEEIKKKKGIKEEVQLTSKQKEMLQAQLDREA 

QVRRRLQELDGELEAALGLLDIILAKNPSGLTQYI 

PVLVDSFLPLLKSPLAAPRIKNPFLSLAACVMPSR 

LKALGTLVSHVTLRLLKPECVLDKSWCQEELSV 

AVKRAVMLLHTHTITSRVGKGEPGAAPLSAPAFS 

LVFPFLKMVLTEMPHHSEEEEEWMAQILQILTVQ 

AQLRASPNTPPGRVDENGPELLPRVAMLRLLTW 

VIGTGSPRLQVLASDTLTTLCASSSGDDGCAFAE 

QEEVDVLLCALQSPCASVRETVLRGLMELHMVL 

PAPDTDEKNGLNLLRRLWVVKPDKEEEIRJCLAE 

RLWSMMGLDLQPDLCSLLIDDVIYHEAAVRQAG 

AEALSQAVARYQRQAAEVMGRLMEIYQEKLYR 

PPPVLDALGRVISESPPDQWEARCGLALALNKLS 

QYLDSSQVKPLFQFFVPDALNDRHPDVRKCMLD 

AALATLNTHGKENVNSLLPVFEEFLKNAPNDAS 

YDAVRQSVVVLMGSLAKHLDKSDPKVKPIVAKL 

IAALSTPSQQVQESVASCLPPLVPAIKEDAGGMIQ 

RLMQQLLESDKYAERKGAAYGLAGLVKGLGILS 

LKQQEMMAALTDAIQDKKNFRRREGALFAFEM 

LCTMLGKLFEPYVVHVLPHLLLCFGDGNQYVRE 

AADDCAKAVMSNLSAHGVKLVLPSLLAALEEES 

WRTKAGSVELLGAMAYCAPKQLSSCLPNIVPKL 

TEVLTDSHVKVQKAGQQALRQIGSVIRNPEILAI 

APVLLDALTDPSRKTQKCLQTLLDTKFVHFIDAP 

SLALIMPIVQRAFQDRSTDTRKMAAQIIGNMYSL 

TDQKDLAPYLPSVTPGLKASLLDPVPEVRTVSAK 

ALGAK4VKGMGESCFEDLLPWLMETLTYEQSSV 

DRSGAAQGLAEVMAGLGVEKLEKLMPEIVATAS 

KVDIAPHVRDGYIM3V1FNYLPITFGDKFTPYVGPII 

PCILKALADENEFVRDTALRAGQRVISMYAETAI 

VTGKMTTETASEDDNFGTAQSNKAIITALGVERR 

NRVLAGLYMGRSDTQLVVRQASLHVWKIVVSN 

TPRTLREDLPTLFGLLLGFLASTCADKRTIAARTL 

GDLVRKLGEKILPEJJPILEEGLRSQKSDERQGVCI 

GLSEIMKSTSRDAVLYFSESLVPTARKALCDPLE 

EVREAAAKTFEQLHSTIGHQALEDILPFLLKQLD

DEEVSEFALDGLKQVMAIKSRVVLPYLVPKLTTP 

PVNTRVLAFLSSVAGDALTRHLGVILPAVMLAL 

KEKLGTPDEQLEMANCQAVILSVEDDTGHRIIIE 

DLLEATRSPEVGMRQAAAIILNIYCSRSKADYTS 

HLRSLVSGLIRLFNDSSPVVLEESWDALNAITKK 

LDAGNQLALIEELHKEIRLIGNESKGEHVPGFCLP 

KKGVTSILPVLREGVLTGSPEQKEEAAKALGLVI 

RLTSADALRPSVVSITGPLIRILGDRFSWNVKAAL 

LETLSLLLAKVGIALKPFLPQLQTTFTKALQDSNR 

GVRLKAADALGKLISIHIKVDPLFTELLNGIRAME 

DPGVRDTMLQALRFVIQGAGAKVDAVIRKNIVS

LLLSMLGHDEDNTRISSAGCLGELCAFLTEEELS 1 

AVLQQCLLADVSGIDWMVRHGRSLALSVAVNV 

APGRLCAGRYSSDVQEMILSSATADRJPIAVSGV 

RGMGFLMRHHIETGGGQLPAKLSSLFVKCLQNP 

SSDIRLVAEKMIWWANKDPLPPLDPQAIKPILKA 

LLDNTKDKNTVVRAYSDQAIVNLLKMRQGEEVF 

QSLSKILDVASLEVLNEVNRRSLKKLASQADSTE 

QVDDTILT 

3542 

A 

62 

1130 

PWNPQDFPGNRGLMG\QKGEJGPP\GQQGKKGAP 

GMP\GLMGSNGSPGQPGTPGSKGSKGEPGIQGMP 

GASGLKGEPGATGSPGEPGYMGLPGIQGKKGDK 

GNQGEKGIQGQKGENGRQGIPGQQGIQGHHGAK 

GERGEKGEPGVRGAIGSKGESGVDGLMGPAGPK 

GQPGDPGPQGPPGLDGKPGREFSEQFIRQVCTDV 

IRAQLPVLLQSGRIRNCDHCLSQHGSPGIPGPPGPI

GPEGPRGLPGLPGRDGVPGLVGVPGRPGVRGLK 

GLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGI 

SKEGPPGDPGLPGKDGDHGKPGIQGQPGPPGICD 

PSLCFSVIARRDPFRKGPNY 

3543 

A 

654 

194 

PARSLEKMKASVVLSLLGYLVVPSGAYILGRCTV 

AKKLHDGGLDYFERYSLENWVCLAYFESKFNPS\ 

AIYENTREGYTGFGLFQMRGSDWCGDHGRNRC 

HMSCSALLNPNLEKTIKCAKTIVKGKEGMGAWP 

TWSRYCQYSDTLARWLDGCKL 

3544 

A 

2 

1074 

SCRLAAGRLAQWLLRASRSGMLRAGWLRGAAA 

LALLLAARVVAAFEPITVGLAIGAASAITGYLSY 

NDIYCRFAECCREERPLNASALKLDLEEKLFGQH 

LATEVI\FKALTGFRNNKNPKKPLTLSLHGWAGT 

GKNPVSQMGAENLHPKGLKSNFVHLFVSTLHFP 

HEQKIKLYQDQLQKWIRGNVSACANSVFIFDEM 

DKL\HPGIIE\AIKPFLDYYEHVERVSYR\KAIFIFLS 

NAGGDLITKTALDFWRAGRKREDIQLKDLEPVL 

SVGVFNNKHSGLWHSGLIDKNLIDYFIPFLPLEYR 

HVKMCVRA£MRARGSAIDEDIVTRVAEEMTFFP\ 

RDEKIYSDKGCKTVQSRLDFH 

3545 

A 

3 

273 

SAQGRSWGRFYRQIKRHPGIIPMIGLICLGMGSA 
ALYLLRLALRSPDVW*SWDRKNNPEPWNRLSPN
DQYKFLAVSTDYKKLKKDRPDF 

3546 

A 

23 

591 

ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA 
PKVPIKMQVKHWPSEQDPEKAWGARVVEPPEK 
DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 
KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 
EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR

GHHCPRPVPRPRLLGLGPSLPCPS 

3547 

A 

23 

591 

ALSTETRTPDMRRLLLVTSLVVVLLWEAGAVPA 

PKVPIKMQVKHWPSEQDPEKAWGARVVEPPEK 

DDQLVVLFPVQKPKLLTTEEKPRGQGRGPILPGT 

KAWMETEDTLGRVLSPEPDHDSLYHPPPEEDQG 

EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR 

GHHCPRPVPRPRLLGLGPSLPCPS 
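
The one-letter residue codes and the three annotation symbols in the table legend (`*` stop codon, `/` possible nucleotide deletion, `\` possible nucleotide insertion) can be tallied mechanically for any row of this table. The following is a minimal sketch only, not part of the disclosed methods: the function name `summarize` and the returned dictionary layout are hypothetical, and the example string is one line of the SEQ ID NO: 3547 entry above.

```python
from collections import Counter

# Residue letters and annotation symbols as given in the table legend.
# The function name and the output layout are illustrative assumptions.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")
MARKS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize(sequence: str) -> dict:
    """Count residues and legend marks in one annotated peptide string."""
    # Ignore spaces and line breaks, which carry no meaning in the table.
    compact = "".join(sequence.split())
    counts = Counter(compact)
    return {
        "residues": sum(n for ch, n in counts.items() if ch in RESIDUES),
        "marks": {name: counts[sym] for sym, name in MARKS.items() if counts[sym]},
        # Anything outside the legend (for example a stray digit) is reported.
        "unrecognized": sorted(ch for ch in counts if ch not in RESIDUES and ch not in MARKS),
    }


if __name__ == "__main__":
    line = "EERPRLWVMPNHQVLLGPEEDQDHIYHPQ*GSR"
    print(summarize(line))
```

Characters reported under `unrecognized` are not defined by the legend, so a non-empty list for a row suggests a transcription problem in that row rather than a feature of the peptide.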

3548 

A 

3 

1641 

TWLPSVPAEEVQQPEMAAVLNAERLEVSVDGLT 

LSPDPEERPGAEGAPLAAATAATALATWIRSRPG 

RLRGTARSPGRRAAGGAAEEARRLEQRWGFGLE 

ELYGLALRFFKEKDGKAFHPTYEEKLKLVALHK 

QVLMGPYNPDTCPEVGFFDVLGNDRKREWAAL 

GNMSKEDAMVEFVKLLNRCCHLFSTYVASHKIE 

KEEQEKKRKEEEERRRREEEERERLQKEEEKRRR 

EEEERLRREEEERRRIEEERLRLEQQKQQIMAAL 

NSQTAVQFQQYAAQQYPGNYEQQQILIRQLQEQ 

HYQQYMQQLYQVQLAQQQAALQKQQEVVVAG 

SSLPTSSKVECNCTQVI*CQFNRQAKTHTDSSEKE 

LEPEAAEEALENGPKESLPVIAAPSMWTRPOIKD 

FKEKIQQDADSVITVGRGEVVTVRVPTHEEGSYL

FWEFATDNYDIGFGVYFEWTDSPNTAVSVHVSE

SSDDDEEEEENIGCEEKAKKNANKPLLDEIVPVY 

RRDCHEEVYAGSHQYPGRGVYLLKFDNSYSLW 

RSKSVYYRVYYTR 

3549 

A 

1837 

3593 

PAVLVLEPASQSRKQQNTASATAQHWSAQIHKE 

SFLAPVFTKDEQKHRRPYEFEVERDAKARGLEQF 

SATHGHTPIILNGWHGESAMDLSCSSEGSPGATS 

PFPVSASTPKIGAISSLQGALGMDLSGILQAGLIHP 

VTGQIVNGSLRRDDAATRRRRGRRKHVEGGMD 

LfFLKEQTLQAGILEVHEDPGQATLSTTHPEGPGP 

ATSAPEPATAASSQAEKSIPSKSLLDWLRQQADY 

SLEVPGFGANFSDKPKQRRPRCKEPGKLDVSSLS 

GEERVPAIPKEPGLRGFLPENKFNHTLAEPILRDT 

GPRRRGRRPRSELLKAPSIVADSPSGMGPLFMNG 

L1AGMDLVGLQNMR>IMPG1PLTGLVGFPAGFAT 

MPTGEEVKSTLSMLPMMLPGMAAVPQMFGVGG 

LLSPPMATTCTSTAPASLSSTTKSGTAVTEKTAE

DKPSSHDVKTDTLAEDKPGPGPFSDQSEPAITTSS 

PVAFNPFLIPGVSPGLIYPSMFLSPGMGMALPAM 

QQARHSEIVGLESQFCrxJKKKKTKGDNPNSHPEPA 

PSCEREPSGDENCAEPSAPLPAEREHGAQAGEGA 

LKDSNNDTN 

3550 

A 

287 

39 

QLNLNKIATSQKHRDFVAESVGEKPVGSLAGIGE 
VMDKKLEEGCFDKAYVVLGQFLVLKKDEDLF*E 
WLRDTGGARTRGSRE 

3551 

A 

21 

3925 

GDLLEVGLPPGLEFPRGICLRGLRRTMSLDFGSV 

ALPVQNEDEEYDEEDYEREKELQQLLTDLPHDM 

LDDDLSSPELQYSDCSEDGTDGQPHHPEQLEMS 

WNEQMLPKSQSVNGPSCQGLEPYNKVTYKPYQS 

SAQNNGSPAQEITGSDTFEGLQQQFLGANENSAE 

NMQIIQLQVLNKAKERQLENLIEKLNESERQIRY 

LNHQLVIIKDEKDGLTLSLRESQKLFQNGKEREIQ 

LEAQIKALETQIQALKVNEEQMIKKSRTTEMALE 

SLKQQLVDLHHSESLQRAREQHESIVMGLTKKY 

EEQVLSLQKNLDATVTALKEQEDICSRLKDHVK 

QLERNQEAIKLEKTEIINKLTRSLEESQKQCAHLL 

QSGSVQEVAQLQFQLQQAQKAHAMSANMNKA 

LQEELTELKDEISLYESAAKLGIHPSDSEGELNIEL 

TESYVDLGIKKVNWKKSKVTSIVQEEDPNEELSK

DEFILKLKAEVQRLLGSNSMKRHLVSQLQNDLK 

DCHKKIEDLHQVKKDEKSIEVETKTDTSEKPKNQ 

LWPESSTSDVVRDDILLLKNEIQVLQQQNQELKE 

TEGKLRNTNQDLCNQMRQMVQDFDHDKQEAV 

DRCERTYQQHHEAMKTQIRESLLAKHALEKQQL 

FEAYERTHLQLRSELDKLNKEVTAVQECYLEVC 

REKDNLELTLRKTTEKEQQTQEKIKEKLIQQLEK

EWQSKLDQTIKAMKKKTLDCGSQTDQVTTSDVI

SKKEMAIMIEEQKCTIQQNLEQEKDIAIKGAMKK 

LEIELELKHCENITKQVEIAVQNAHQRWLGELPE 

LAEYQALVKAEQKKWEEQHEVSVNKRISFAVSE 

AKEKWKSELENMRKNILPGKELEEKIHSLQKELE 

LKNEEVPVVIRAELAKARSEWNKEKQEEIHRIQE 

QNEQDYRQFLDDHRNKINEVLAAAKEDFMKQK 

TELLLQKETELQTCLDQSRREWTMQEAKRIQLEI 

YQYEEDILTVLGVLLSDTQKEHISDSEDKQLLEI 

MSTCSSKWMSVQYFEKLKGCIQKAFQDTLPLLV

ENADPEWKKRNMAELSKDSASQGTGQGDPGPA 

AGHHAQPLALQATEAEADKKKVLEIKDLCCGHC 

AVVEKIGEENNKVVEELIEENNDMKNKLEELQT 

LCKTPPRSLSAGAIENACLPCSGGALEELRGQYIK 

AVKKIKCDMLRYIQESKERAAEMVKAEVL*ERQ 

ASKLATMAKLLETPISSKSQSKTTQSGMSK 

3552 

A 

111 

375 

ARTRQTSGQAREPEKESPAPGGGGLAEIRSRQQL 
SQTSRIPPLAKDQAVEAMFPPARGKELLSFEDVA 
MYFTREEWGHLNWGQKDLYRDVMLENYRNMV 
LLVYFQFDAAIPLC*TSLAHSSWLQLYFRLYF 

3553 

A 

76 

72 

PGVRGVEAPGGVAPGRNAMRRGERRDAGGPRP 

ESPVPAGRASLEEPPDGPSAGQATGPGEGRRSTE 

SEVYDDGTNTFFWRAHTLTVLFILTCTLGYVTLL 

EETPQDTAYNTKRGIVASILVFLCFGVTQAKDGP 

FSRPHPAYWRFWLCVSVVYELFLIFILFQTVQDG 

RQFLKYVDPKLGVPLPERDYGGNCLIYDPDNET 

DPFHNIWDKLDGFVPAHFLGWYLKTLMIRDWW

MCMHSVMFEFLEYSLEHQLPNFSECWWDHW1M 

DVLVCNGLGIYCGMKTLEWLSLKTYKWQGLWN

IPTYKGKMKRJAFQnPYSWVKJFEWKPASSLRR 

tin * \/r , r:iTT vttt t apt mtp vt wmppft-JYT V 

WLA V LAjIIL V r JLL,AilL,£N lri L»JSJr V W ivjurr Juri ilv 

LLRLVFFVNVGGVAMREIYDFMDDPKPHKKLGP 
QAWLVAAITATELLIVVKYDPHTLTLSLPFYISQC 
WTLGSVLALTWTVWRFFLRDITLRYKETRWQK 
WQNKDDQGSTVGNGDQHPLGLDEDLLGPGVAE 
GEGAPTPN*PRGPAPRPLPSAPRAVCGASSRR

3554

A

9 


FDFF^AT P^P^T OTSWSFGPMSRRALRRLRGEOR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVKNRFELINIDDLEDDPWNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRM 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS 

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD

3555 

A 

2 

2106 

FDEFSALPSPSLQTSWSFGPMSRRALRRLRGEQR 

GQEPLGPGALHFDLRDDDDAEEEGPKRELGVRR 

PGGAGKEGVRVNNRFELINIDDLEDDPVVNGERS 

GCALTDAVAPGNKGRGQRGNTESKTDGDDTET 

VPSEQSHASGKLRKKKKKQKNKKSSTGEASENG 

LEDIDRILERIEDSTGLNRPGPAPLSSRKHVLYVE 

HRHLNPDTELKRYFGARAILGEQRPRQRQRVYP 

KCTWLTTPKSTWPRYSKPGLSMRLLESKKGLSFF 

AFEHSEEYQQAQHKFLVAVESMEPNNIVVLLQT 

SPYHVDSLLQLSDACRFQEDQEMARDLVERALY 

SMECAFHPLFSLTSGACRLDYRRPENRSFYLALY 

KQMSFLEKRGCPRTALEYCKLILSLEPDEDPLCM 

LLLIDHLALRARNYEYLIRLFQEWEVGASLAHRN 

LSQLPNFAFSVPLAYFLLSQQTDLPECEQSSARQ 

KASLLIQQALTMFPGVLLPLLESCSVRPDASVSSH 

RFFGPNAEISQPPALSQLVNLYLGRSHFLWKEPA 

TMSWLEENVHEVLQAVDAGDPAVEACENRRKV 

LYQRAPRNIHRHVILSEIKEAVAALPPDVTTQSV 

MGFDPLPPSDTIYSYVRPERLSPISHGNTIALFFRS

LLPNYTMEGERPEEGVAGGLNRNQGLNRLMLA 

VRDMMANFHLNDLEAPHEDDA*GEGEWD 

3556 

A

3388 

1650 

KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL

VKREYLRVNVVKTCEEILNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 

FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP 

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL 

EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 

LPWPELPEVPMEMPLVLPPELELLSLEAVHRAV 

ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 

LSAQQILHVKQEKPYGRLLIQPGPRFH 

3557 

A 

3388 

1650 

KTRGTMFYYPNVLQRHTGCFATIWLAATRGSRL 
VKREYLRVNVVKTCEEELNYVLVRVQPPQPGLP 

RPRFSLYLSAQLQIGVIRVYSQQCQYLVEDIQHIL 

ERLHRAQLQIRIDMETELPSLLLPNHLAMMETLE 

DAPDPFFGMMSVDPRLPSPFDIPQIRHLLEAAIPE 

RVEEIPPEVPTEPREPERIPVTVLPPEAITILEAEPIR 

MLEIEGERELPEVSRRELDLLIAEEEEAILLEIPRL 

PPPAPAE*GQELLDQVGCQCWEGSPHFSCPFPLR 

VEGMGEALGPEELRLTGWEPGALLMEVTPPEEL 

RLPAPPSPERRPPVPPPPRRRRRRRLLFWDKETQI 

SPEKFQEQLQTRAHCWECPMVQPPERTIRGPAEL 
FRTPTLSGWLPPELLGLWTHCAQPPPKALRRELP

EEAAAEEERRKIEVPSEIEVPREALEPSVPLMVSL
EISLEAAEEEKSRISLIPPEERWAWPEVEAPEAPA 
LPVVPELPEVPMEMPLVLPPELELLSLEAVHRAV 
ALELQANREPDFSSLVSPLSPRRMAARVFYLLLV 
LSAQQILHVKQEKPYGRLLIQPGPRFH 

3558 

A 

489 

2360 

IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEIEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG 

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

QGEK.S A 1 PSRKlLUr N 1 ubrArv Loor r V A V V a 1 r 

LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKVVSA 

FLKVSSWKDEATVRMAVQDAVDALMQKAFNS 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 

LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 

SCSFARHSLLQTLYKV 

3559 

A 

489 

2360 

IRPRPRGRRRALDSPNAAAPPVYVCRSPGEPTSL 

VNMASEDIAKLAETLAKTQVAGGQLSFKGKSLK 

LNTAEDAKDVIKEIEDFDSLEALRLEGNTVGVEA 

ARVIAKAL*KKSELKRCHWSDMFTGRLRTEIPPA

LISLGEGLITAGAQLVELDLSDNAFGPDGVQGFE 

ALLKSSACFTLQELKLNNCGMGIGGGKILAAALT 

ECHRKSSAQGKPLALKVFVAGRNRLENDGATAL 

AEAFRVIGTLEEVHMPQNGINHPGITALAQAFAV 

NPLLRVINLNDNTFTEKGAVAMAETLKTLRQVE 

VINFGDCLVRSKGAVAIADAIRGGLPKLKELNLS 

FCEIKRDAALAVAEAMADKAELEKLDLNGNTLG

EEGCEQLQEVLEGFNMAKVLASLSDDEDEEEEE 

EGEEEEEEAEEEEEEDEEEEEEEEEEEEEEPQQRG 

r\f^T?Tr C* ATnPDT/TI TVTJXTT 1 ^ "CT> A Tl\ 7T CCDDD A TYWOTIT 

QGEKSA I PSKJULL>rN 1 ObPAr VLbdrrTAL/ 1 r 
LAFPSPEKLLRLGPKSSVLIAQQTDTSDPEKWSA 

SSFNSNTFLTRLLVHMGLLKSEDKVKAIANLYGP 
LMALNHMVQQDYFPKALAPLLLAFVTKPNSALE 
SCSFARHSLLQTLYKV 

3560 

A 

2 

1198 

FVRELPRPRPGAATAAIMVSVINTVDTSHEDMIH 
DAQMDYYGTRLATCSSDRSVKIFDVRNGGQILIA 

DLRGHEGPVWQVAWAHPMYGNILASCSYDRKV

IIWREENGTWEKSHEHAGHDSSVNSVCWAPHDY 

GLILACGSSDGAISLLTYTGEGQWEVKKrNNAHT 

IGCNAVSWAPAVVPGSLIDHPSGQKPNYIKRFAS 

O UuD IN Li JvJL W JsJbtlilJOy W K.bfcv^KLbArlblJWVK 

DVAWAPSIGLPTSTIASCSQDGRVFIWTCDDASS

NTWSPKLLHKFNDVVWHVSWSITANILAVSGGD 

NKVTLWKESVDGQWVCISDVNKGQGSVSASVT 

EGQQNEQ*QDRWGLAPHPPAPGLPLPGPTNQTT 

vjiVorv^Ly y V Y r rKKo Y KUorlKLlIULN VIUDAJL 

3561 

A 

540 

86 

WRVKEMTSTLPKALGRKTASRSHTTLQGGSCCP 

VLWTAKLRCRKLRFPLPPPPPSSSAWPWQGWGI 

RGEQEAEGPLGETGPPVGPELSGLRQWRKLIKGR 

YGEWRGSGQKTGQPS*TTMQGGETEENRTETTT 

GNKQRESEAPWVRHTYIT 

3562 

A 

1920 

242 

PMMAMPFFERFKSSIQRPSPVLVLSQNTKRESGR 

KVQSGNINAAKTIADIIRTCLGPKSMMKMLLDP 

MGGIVMTNDGNAILREIQVQHPAAKSMIEISRTQ 

DEEVGDGTTSVIILAGEMLSVAEHFLEQQMHPTV 

VISAYRKALDDMISTLKKISIPVDISDSDMMLNIIN

SSITTKAISRWSSLACNIALDAVKMVQFEENGRK 

EIDIKKYARVEKIPGGIIEDSCVLRGVMINKDVTH

PRMRRYIKNPRIVLLDSSLEYKKGESQTDIEITRE 

EDFTR1LQMEEEYIQQLCED1IQLKPDVVITEKGIS 

DLAQHYLMRANITAIRRVRKTDNNR1ARACGARI 

VSRPEELREDDVGTGAGLLE1KKJGDEYFTFITDC 

KDPKACTILLRGASKJEILSEVERNFQDAMQVCRN 

VLLDPQLVPGGGASEMAVAHALTEKSKAMTGV 

EQWPYRAVAQALEVIPRTLIQNCGASTIRLLTSLR 

AKHTQENCETWGVNGETGTLVDMKELGIWEPL 

AVKLQTYKTAVETAVLLLRIDDIVSGHKKKGDD 

QSRQGGAPDAGQE 

3563 

A 

1571 

560 

GPSLLGTRGTPNPARTLQIPFLIIGRRLTGRMAAV 
DDLQFEEFGNAATSLTANPDATTVNIEDPGETPK 
HQPGSPRGSGREEDDELLGNDDSDKTELLAGQK 
KSSPFWTFEYYQTFFDVDTYQVFDRIKGSLLPIPG

NFLIHLGEKTYHYVPEFRKVSIAATIIYAYAWLVP 
LALWGFLMWRNSKVMNIVSYSFLEIVCVYGYSL
FIYIPTAILWIIPHK^VRWILVMIALGISGSLLAMT 
r W r A V ivfcD IN KK V A.L.A 1 1 V llVJLl^niVU^oVUUJLA 

YFFDAPEMDHLPTTTATPNQTVAAAKSS 

3564 

A 

1 

328 

NSRVDDFVAHLQRPLLGPASCLGILRPAMTAHSF 
ALPGIIFTTFWGLVGIAGPWFVPKGPNRGVIITML 
VATAVCCYLFWLIAILAQLNPLFGPQLKNETIWY 
VRFLWE 

3565 

A 

2 

1081 

FVTDFPARSMAATSLMSALAARLLQPAHSCSLRL 
RPFHLAAVRNEAVVISGRKLAQQIKQEVRQEVEE 
WVASGNKRPHLSVILVGENPASHSYVLNKTRAA 

A VvrSTN^FTTMlfPA^T^FFFT T "NT nsJK'T lsTlsrnn>JVr> 

GLLVQLPLPEHIDERRICNAVSPDKDVDGFHVIN 
VGRMCLDQYSMLPATPWGVWEIIKRTGIPTLGK 
KVVVAGRSKNVGMPIAMLLHTDGAHERPGGDA 
TVTISHRYTPKEQLKJKIITILADIVISAAGIPNLITA 
DMIKEGAAVIDVGINRVHDPVTAKPKLVGDVDF 

EGVRQKAGYITPVPGGVGPMTVAMLMKNTIIAA
KKVLRLEEREVLKSKELGVATN 

3566 

A 

3 

1130 

SCRRGRQQQRRNVSLSSQFAHTMAAPAQQTTQP 
GGGKRKGKAQYVLAKRARRCDAGGPRQLEPGL 
QGILITCNMNERKCVEEAYSLLNEYGDDMYGPE 
KFTDKDQQPSGSEGEDDDAEAALKKEVGDIKAS 
TEMRLRRFQSVESGANNVVFIRTLGIEPEKLVHHI 
LQDMYKTKXKXTRVILRMLPISGTCKj^FLEDMK 
FCYAFTFr FPWFK^PNK^TTOfVYRfSRNN^HVNR 

EEVIRELAGIVCTLNSENKVDLTNPQYTVVVEIIK

AVCCLSVVKDYMLFRKYNLQEVVKSPKDPSQLN 

SKQGNGKEAKLESADKSDQNNTAEGKNNQQVP 

ENTEELGQTKPTSNPQVVNEGGAKPELASQATE 

GSKSNENDFS 

3567 

A 

248 

3498 

GKKDSSPWTCPFHPPLQLFFVIRNTRQLGDFHLA 

KIKVRNYWTADGDLDIGAKNVKLYVNRNLIFNG 

KLDKGDREAPADHSJLVDQKNEKSEQLEEAMNA 

HSEESKGTHEMAGASGDKELGLGCSPPAETLAD 

AKLSSQGNVSGKRKNSTNCRKDSLSQLEEYLRLS 

AVPTSMGDMPSAPATSPPVKCPPVHEEPSLIQQL 

ENLMGRKICEPPGKTPSWLQPSPTGKDRKQGGR 

KPKPLWLSPEKPLAWKGRLPSDDVIGEGPGETEA 

RDKGLRHEPGWGTSRSVNTKERPQRATTKVHSD 

DSDIFNQPPNRERPASGRRGSRKDAGSSSHGDDQ 

PASREDTWSSRTPSRSRWRSEQEHTLHESWSSLS 

AFDRSHRGRISNTELPGDILDELLQQKSSRHSDLP 

PSKKGEQPGLSRGQDGYSGETDAGGDFKIPVLPY 

GQRLVIDIKSTWGDRHYVGLNGIEIFSSKGEPVQI 

SNIKADPPDINILPAYGKDPRVVTNLIDGVNRTQ 

DDMHVWLAPFTRGRSHSITIDFTHPCHVALIRIW 

NYNKSRIHSFRGVKDITMLLDTQCIFEGEIAKASG

TLAGAPEHFGDTILFTTDDDILEAIFYSDEMFDLD

VGSLDSLQDEEAMRRPSTADGEGDERPFTQAGL 

GADERIPELELPSSSPVPQVTTPEPGIYHGICLQLN

FTASWGDLHYLGLTGLEVVGKEGQALPIHLHQIS 

ASPRDLNELPEYSDDSRTLDKLIDGTNITMEDEH 

MWLIPFSPGLDHVVTIRLDRAESIAGLRFWNYNK 

SPEDTYRGAKIVHVSLDGLCVSPPEGFLIRKGPG 

NCHFDFAQEILFVDYLRAQLLPQPARRLDMRSLE 

CASMDYEAPLMPCGFIFQFQLLTSWGDPYYIGLT 

GLELYDERGEKIPLSFNNIAAFPDSVNSLFGVGG 

DVRTPDKLIDQVNDTSDGRHMWLAPILPGLVNR 

VYVIFDLPTTVSMIKLAVNYAKTPHRGVKEFGLL

VDDLLVYNGILAMVSHLVGGILPTCEPTVPYHTI 

LFTEDRDIRHQEKHTTISNQAEDQDVQMMNENQ 

nTNAKRKQSVVDPALRPKTCISEKETRRRRC 

3568 

A 

50 

1724 

AQGGTLSAASRFCRGGLLGPWLHPASEMAATLD 

LKSKEEKDAELDKRIEALRRKNEALIRRYQEIEE 

DRKKAELEGVAVTAPRKGRSVEKENVAVESEKN 

LGPSRRSPGTPRPPGASKGGRTPPQQGGRAGMG 

RASRSWEGSPGEQPRGGGAGGRGRRGRGRGSPH 

LSGAGDTSISDRKSKEWEERRJR.QNIEKMNEEME 

KIAEYERNQREGVLEPNPVRNFLDDPRRRSGPLE 

ESERDRREESRRHGRNWGGPDFERVRCGLEHER 

QGRRAGLGSAGDMTLSMTGRERSEYLRWKQER 

EKIDQERLQRHRKPTGQWRREWDAEKTDGMFK 

DGPVPAHEPSHRYDDQAWARPPKPPTFGEFLSQ 

HKAEASSRRRRKSSRPO AKA APR A YSDUDDR WF 

TKEGAASPAPETPQPTSPETSPKETPMQPPEIPAP 

AHRPPEDEGEENEGEEDEEWEDISEDEEEEEfEVE 

EGDEEEPAQDHQAPEAAPTGIPCSEQAHGVPFSP 

EEPLLEPQAPGTPSSPFSPPSGHQPVSDWGEEVEL 

NSPRTTHLAGALSPGEAWPFESV 

3569 

A 

1 

912 

MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 

GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 

RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 

RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 

OGDIDATFKDT ^TRSVRT VRnKnTnKFKGFPYVF 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 

DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 

PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 

VQKEQE 

3570 

A 

1 

912 

MGRVGRAGVQLGRRRTTWAAERTGQAAAGGP 
GRALRGQRPDLRSGGAADSPAAGRGELYCGVLP 
RSPWFLSERRRQMADFDTYDDRAYSSFGGGRGS 
RGSAGGHGSRSQKELPTEPPYTAYVGNLPFNTV 
onnTnArPKrif ^FR^vpf vr ni^ nTn^RKTiPrvvF 

V^kJJL'AlvrVir rvJ->l-rOliVij V ivL V JvL/iVLy 1 Ut\T INAJr v_, I V o 

FDEVDSLKEALTYDGALLGDRSLRVDIAEGRKQ 
DKGGFGFRKGGPDDRGFRDDFLGGRGGSRPGDR

RTGPPMGSRFRDGPPLRGSNMDFREPTEEERAQR 
PRLQLKPRTVATPLNQVANPNSAIFGGARPREEV 
VQKEQE 

3571 

A 

28 

131 

RHFFGNLCAMRAKWRKKRMRIU.KRKRRKM 
RSK 

3572 

A 

3 

1202 

QSEPHRKVRVDPPVRDRPPPHPPPLLVQRALPGQ 

GQAEGSDGADGAKRRAMAHQTGIHATEELKEFF 

AKARAGSVRLIKWIEDEQLVLGASQEPVGRWD 

QDYDRAVLPLLDAQQPCYLLYRLDSQNAQGFE 

WLFLAWSPDNSPVRLKMLYAATRATVKKEFGG 

GHIKDELFGTVKDDLSFAGYQKHLSSCAAPAPLT

SAERELOOIRJNEVKTEISVESK1HOTLOGF AFPLO 

PEAQl^LQQLKQKMXTNYIQMKLDLERETIELVH 

TEPTDVAQLPSRVPRDAARYHFFLYKHTHEGDP 

LESVVFIYSMPGYKCSIKERMLYSSCKSRLLDSV 

EQDFHLEIAKKIEIGDGAELTAEFLYDEVHPKQH 

AFKQAFAKPKGPGGKRGHKRLIRGPGENGDDS 

3573 

A 

49 

1869 

PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEV 

EEISLLQPQVEESVLNLGKFHSIVRLVAFCPFASS 

QVALENANAVSEGWHEDLRl^LLETHLPSKKKK 

VLLGVGDPKIGAAIQEELGYNCQTGGVIAEILRG 

Vl^HFI^VKGLTDLSACKAQLGLGHSYSRAKV 

lOT^m^VDNMIIQSISLLD^^ 

WGYHFPELVKIINDNATYCRLAQFIGNRRELNE 

DKIJEKLEELTMIXj^ 

DLINIESFSSRVVSLSEYRQSLHTYLRSKMSQVAP
SLSALIGEAVGARLIAHAGSLTNLAKYPASTVQIL 
GAEKALFRALKTRGNTPKYGLIFHSTFIGRAAAK 
NKGRISRYLANKCSIASRIDCFSEVPTSVFGEKLR 
EQVEERLSFYETGEIPRl<mDVMKEAMVQAEAE 




WO 01/57190 


PCT/US01/04098



EAAAEITRKLEKQEKKJILKKEKKRLAALALASS 

ENSSSTPEECEETSEKPKKKKKQKPQEVPQENGM 

EDPSISFSKPKKKKSFSKEELMSSDLEETAGSTSIP 

KRKXSTPKEETVNDPEEAGHRSRSKKKRKFSKEE 

PVSSGPEEAVGKSSSKKKKKFHKASQED 

3574 

A 

284 

2032 

CGNERTARLWVQPVVSTMPQASEHRLGRTREPP 

VNIQPRVGSKLPFAPRARSKERRNPASGPNPMLR 

PLPPRPGLPDERLKKLELGRGRTSGPRPRGPLRA 

DHGVPLPGSPPPTVALPLPSRTNLARSKSVSSGDL 

RPMGIALGGHRGTGELGAALSRLALRPEPPTLRR 

STSLRRLGGFPGPPTLFSIRTEPPASHGSFHMISAR

SSEPFYSDDKMAHHTLLLGSGHVGLRNLGNTCF 

LNAVLQCLSSTRPLRDFCLRRDFRQEVPGGGRA 

QELTEAFADVIGALWHPDSCEAVNPTRFRAVFQ 

KYVPSFSGYSQQDAQEFLKLLMERLHLEINRRGR 

RAPPILANGPVPSPPRRGGALLEEPELSDDDRANL 

MWKRYLEREDSKIVDLFVGQLKSCLKCQACGY

KM irbVrCDLoLrlrl^OrAGGKVSLRiJCFNL 

KEEELESENAPVCDRCRQKTRSTKKLTVQRFPRI 

LVLHLNRFSASRGSIKKSSVGVDFPLQRLSLGDF 

ASDKAGSPVYQLYALCNHSGSVHYGHYTALCR 

CQTGWHVYNDSRVSPVSENQVASSEGYVLFYQL 

MQEPPRCL 

3575 

A 

1 

2408 

RELDSLADLPERIKPPYANGLSTSHLRSSSVEDVK 

LIISEGRPTIEVRRCSMPSVICEHTKQFQTISEESN

QGSLLTVPGDTSPSPKPEVFSNVPERDLSNVSNIH 

SSFATSPTGASNSKYVSADRNLIKNTAPVNTVMD 

SPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDF 

ICPNSNIPDQESSLQSFCNSENKVLKENADFLSLR 

QTELPGNSCAQDPASFMPPQQPCSFPSQSLSDAES 

ISKHMSLSYVANQEPGILQQKNAVQIISSALDTD

NESTKDTENTFVLGDVQKTDAFVPVYSDSTIQEA 

SPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAF 

SKLTYKSSSGHEVENSTTDTQVISHEKENKLESL 

VLTHLSRCDSDLCEMNAGMPKGNLNEQDPKHC 

PESEKCLLSIEDEESQQSILSSLENHSQQSTQPEM 

HKYGQLVKVELEENAEDDKTENQIPQRMTRNK 

ANTMANQSKQILASCTLLSEKDSESSSPRGRIRLT

EDDDPQIHHPRBCRKVSRVPQPVQVSPSLLQAKEK 

TQQSLAAIVDSLKLDEIQPYSSERANPYFEYLHIR 

KKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLLD 

U Nr i^oisJCIr 111 r r r oLoDr LJSJbLr KyQxi V V KMJsJL. 

RLQHSIEREKLIVSNEQEVLRVHYRAARTLANQT 

LPFSACTVLLDAEVYNVPLDSQSDDSKTSVRDRF 

NARQFMSWLQDVDDKFDKLKTCLLMRQQHEA 

AALNAVQRLEWQLKLQELDPATYKSISIYEIQEF 

YVPLVDVNDDFELTPI 

3576 

A 

5 

1421 

LRLAWHDGARWPLGTPRAAATRREAAALPPVT
LALLCLDGVFLSSAENDFVHRIQEELDRFLLQKQ 
LSKVLLFPPI SSRT RYT THRTA FNFDT T ^SFWfrF 

GWKRRTVICHQDIRVPSSDGLSGPCRAPASCPSR 

YHGPRPISNQGAAAVPRGARAGRWYRGRKPDQ 

PLYVPRVLRRQEEWGLTSTSVLKREAPAGRDPEE 

PGDVGAGDPNSDQGLPVLMTQGTEDLKGPGQR 

CENEPLLDPVGPEPLGPESQSGKGDMVEMATRF 

GSTLQLDLEKGKESLLEKRLVAEEEEDEEEVEED 
GPSSCSEDDYSELLOEITDNLTKKEIOIEKTHT DTS 
SFMEELPGEKDLAHVVEIYDFEPALKTEDLLATF 
SEFQEKGFRIQWVDDTHALGIFPCRASAAEALTR
EFSVLKIRPLTQGTKQSKLKALQRPKLLRLVKER 
PQTNATVARRLVARALGLQHKKKERPAVRGPLP 
P 

3577 

A 

102 

1998 

DTRTPGSLEMGPLQFRDVAIEFSLEEWHCLDTAQ 

RNLYRNVMLENYSNLVFLGIVVSKPDLIAHLEQG 

KKPLTMKRHEMVANPSGPVICSHFAQDLWPEQN

IKDSFQKVILRRYEKRGHGNLQLIKRCESVDECK 

VHTGGYNGLNQCSTTTQSKVFQCDKYGKVFHK 

FSNSNRHNIRHTEKKPFKCIECGKAFNQFSTLITH 

KXIHTGEKPYICEECGKAFKYSSALNTHKRIHTG 

EKPYKCDKCDKAFIASSTLSKHEIIHTGKKPYKCE 

ECGKAFNQSSTLTKHKKIHTGEKPYKCEECGKAF 

NQSSTLTKHKKIHTGEKPYVCEECGKAFKYSRIL

TTHKRIHTGEKPYKCNKCGKAFIASSTLSRHEFIH 

MGKKHYKCEECGKAFIWSSVLTRHKRVHTGEKP 

YKCEECGKAFKYSSTLSSHKRSHTGEKPYKCEEC 

SSSLTKHKKIHTGEKPYKCEECGKAFNQSSSLTK 
HKKIHTGEKPYKCEECGKAFNQSSTLIKHKKIHT 
REKPYKCEECGKAFHLSTHLTTHKILHTGEKPYR
CRECGKAFNHSATLSSHKKIHSGEKPYECDKCG 
KAFISPSSLSRHEIIHTGEKP 

3578 

A 

1725 

445 

RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNNIQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKJDLLGIIKGMKVELSTVNV 

RTTKPPKRKPLKSLEATLGRLRRATEYAPKKRIEP 

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNIISDMKVARSATARV

RSRPELRIOFDEGYDNYPGOFKTDDI KKRKNTFT 

GKJILNIFDMMAVTKEAPETDTSPSLWDVEFAKQ 

LATWEQPLQNGFEELIQWTK£GKXWEFPINNEA 

GFDDDGSEFHEmFLEKmESl^KQGPIROTMELV 

TCGLSKNPYLSVKQKVEHIEWFRNYFNEKXDILK 

ESNIQFKLRPWKFLFRNN 

3579 

A 

1725 

445 

RPRRRGTHHFSCVLGSFRVSAMFPRVSTFLPLRP 

LSRHPLSSGSPETSAAAIMLLTVRHGTVRYRSSA 

LLARTKNNIQRYFGTNSVICSKKDKQSVRTEETS 

KETSESQDSEKENTKKDLLGHKGMKVELSTVNV 

RTTKPPKRKPLKSLEATLGRLRRATEYAPKKRIEP

LSPELVAAASAVADSLPFDKQTTKSELLSQLQQH 

EEESRAQRDAKRPKISFSNIISDMKVARSATARV 

RSRPELRIOFDEGYDNYPGOEKTDDLKKRKNTFT 

GKRLNIFDIViMAVTKEAPETDTSPSLWDVEFAKQ 

LATVNEQPLQNGFEELIQWTOEGIO.WEFPIN^A 

GFDDDGSEFHEHIFLEKmESFPKQGPDUlFi^ 

TCGI^KNPYLSVKQKVEHIEWFRNYl^KKDILK 

ESNIQFKLRPWKFLFRNN 

3580 

A 

3673 

1619 

LYCVAPYSRHLLGRMSI^PIVI^ 

LRQRNLKFQGASNLTLSETQNGDVSEETMGSRK 

VKKSKQl^IvnW^ 

KSPQKSTVLTNGEAAMQSSNSESKXKKKKKRK 

MVNDAEPDTKKAKTENKGKSEEESAETTKETEN 

NVEKPDNDEDESEVPSLPLGLTGAFEDTSFASLC 

NLVNENTLKAIKEMGFTNMTEIQHKSIRPLLEGR 

DLLAAAKTGSGKTLAFLIPAVELIVKLRFMPRNG 

TGVLILSPTRELAMQTFGVLKELMTHHVHTYGLI 

MGGSNRSAEAQKLGNGINIIVATPGRLLDHMQN 

TPGFMYKNLQCLVIDEADRJLDVGFEEELKQIIKL 

LPTRRQTMLFSATQTRKVEDLARISLBCKEPLYVG 

VDDDKANATVDGLEQGYVVCPSEKRFLLLFTFL 

KKNRKKKLMVFFSSCMSVKYHYELLNYIDLPVL 

AIHGKQKQNKRTTTFFQFCNADSGTLLCTDVAA 

KvjL,UJLrr» V u W 1 V Y DrrUur!S±, Y 1HK VCjR I AKGL 

NGRGHALLILRPEELGFLRYLKQSKVPLSEFDFS 

WSKISDIQSQLEKLIEKNYFLHKSAQEAYKSYIRA

YDSHSLKQIFNVNNLNLPQVALSFGFKVPPFVDL 

NVNSNEGKQKJCRGGGGGFGYQKTKKVEKSKIF 

ISJ^loJsJVooL'oK.v^r oii 

3581 

A 

23 

453 

LCRCICIKNITPHCLWDKVLSQFTYILDNLSNFMS 

HHPHSLRNSCLIRMDLLYWQFTIYTITFCFSHLSG 

RLTLSAQHISHRPCLLSYSLLFWKVHHLFLEGFPC 

SPRLDEMSFHQFPQHPVHVSVVHLPIVYKGSMT 

QVSPH 

3582 

A 

3 

950 

TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 

GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 

GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 

AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC 

bNHLQDKJQKLYERXIKEGMDMNYnQRKKEFRN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK 

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS

AVGTIVKKAKQ 

3583 

A 

3 

950 

TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEA 
GIEAVGSAAEEKGGLVSDAYGEDDFSRLGGDED 
GYEEEEDENSRQSEDDDSETEKPEADDPKDNTE 
AEKRDPQELVASFSERVRNMSPDEIKIPPEPPGRC 
oJNrlL. ^lJlvlV^jSJL Y JiKJKiiviiwiVLUMN Y IH^KJKJsJir KJN 

PSIYEKLIQFCAIDELGTNYPKDMFDPHGWSEDS 

YYEALAKAQKIEMDKLEKAKKERTKIEFVTGTK

KGTTTNATSTTTTTASTAVADAQKRKSKWDSAI 

PVTTIAQPTILTTTATLPAVVTVTTSASGSKTTVIS 

AVGTIVKKAKQ 

3584 

A 

3 

1139 

PGSTISSRADRLGAPVLAHPKMAERQEEQRGSPP 

LRAEGKADAEVKLILYHWTHSFSSQKVRLVIAE 

KALKCEEHDVSLPLSEHNEPWFMRLNSTGEVPV 

LIHGENIICEATQIIDYLEQTFLDERTPRLMPDKES 

MYYPRVQHYRELLDSLPMDAYTHGCILHPELTV 

DSMIPAYATTRIRSQIGNTESELKKLAEENPDLQE 

AYTAKOKRLKSKLLDHDNVKYT KKXl DFT FKVT 

DQVETELPRRNEETPEEGQQPWLCGESFTLADVS 
LAVTLHRLKFLGFARRNWGNGKRPNLETYYERV 
LKJRXTTOKVLGH\nWILISAVLPTAFRVAXKRAP 
KVLGTTLVVGLLAGVGYFArT^FRKRLGSMILA 
LRPRPNYF 

3585 

A 


1777 

RRHSPGSPAFAPSSRATAICPRAARAPATLLLALG 

AVLWPAAGAWELTILHTNDVHSRLEQTSEDSSK 

CVNASRCMGGVARLFTKVQQLRRAEPNVLLLDA 

GDQYQGTIWFTVYKGAEVAHFMNALRYDAMA 

LGNHEFDNGVEGLIEPLLKEAKFPILSANIKAKGP 

LASQISGLYLPYKVLPVGDEWGIVGYTSKETPF

LSNPGTNLVFEDEITALQPEVDKLKTLNVNKIIAL 

GHSGFEMDKLIAQKVRGVDVVVGGHSNTFLYT 

GNPPSKEVPAGKYPFIVTSDDGRKVPVVQAYAF 

GKYLGYLKIEFDERGNVISSHGNPILLNSSIPEDPS 

IKADINKWRIKLDNYSTQELGKTIVYLDGSSQSC 

RFRECNMGNLICDAMINNNLRHTDEMFWNHVS

MClLNUOUiKbrllJbKNNG 1 ITWbNLAAVLPFGG 

TFDLVQLKGSTLKKAFEHSVHRYGQSTGEFLQV 

GGIHVVYDLSRKPGDRVVKLDVLCTKCRVPSYD 

PLKMDEVYKVILPNFLANGGDGFQMIKDELLRH 

DSGDQDINVVSTYISKMKVIYPAVEGRJKFSTGS 

HCHGSFSLIFLSLWAVIFVLYQ 

3586

A 

A 

i J yy 

OOI 

OO 1 

LbNKX) V LbPQLKJJbNSKLRRKLNEVQSFSEAQTE 

MVRTLERKLEAKMrKEESDYHDLESVVQQVEQN 

LELMTKRAVKAENHWKLKQEISLLQAQVSNFQ 

RENEALRCGQGASLTVVKQNADVALQNLRVVM 

NSAQASIEQLVSGAETLNLVAEILKSIDRISEVKD 

EEEDS 

3587 

A 

88 

1639 

GCVGRGLPLPPRHPTPPSSSSSPFVLLAFLLLVRL 

DPAVSGKMAAPRPPPARLSGVMVPAPIQDLEAL 

RALTALFKEQRNRETAPRTJFQRVLDILKKSSHA 

VELACRDPSQVENLASSLQLITECFRCLRNACIEC 

SVNQNSIRNLDTIGVAVDLILLFRELRVEQESLLT 

AFRCGLQFLGNIASRNEDSQSIVWVHAFPELFLS 

CLNHPDKKIVAYSSMILFTSLNHERMKELEENLN 

IAIDVIDAYQKHPESEWPFLIITDLFLKSPELVQA

MFPKLNNQERVTLLDLMIAKITSDEPLTKDDIPVF 

LRHAELIASTFVDQCKTVLKLASEEPPDDEEALA 

1 LKULJJ V LLcM l VJN I iiLLAj YLl^ VrrObLcKVlUL 

LRVIHVAGKETTNIFSNCGCVRAEGDISNVANGF 

KSHLIRLIGNLCYKNKDNQDKVNELDGIPLILDN 

CNISDSNPFLTQWVIYAIRNLTEDNSQNQDLIAK 

MEEQGLADASLLKKVGFEVEKKGEKLILKSTRD 

TPKP 

3588 

A 

3 

1462 

DSPRNRFEILGRPTRTPTRPGPRPAMEDLDALLSD 

LETTTSHMPRSGAPKERPAEPLTPPPSYGHQPQT 

GSGESSGASGDKDHLYSTVCKPRSPKPAAPAAPP 

FSSSSGVLGTGLCELDRLLQELNATQFNITDEIMS 

QFPSSKVASGEQKEDQSEDKKRPSLPSSPSPGLPK 

ASATSATLELDRLMASLSDFRVQNHLPASGPTQP 

PWSSTNEGSPSPPEPTGKGSLDTMLGLLQSDLSR 

RGVPTQAKGLCGSCNKPIAGQVVTALGRAWHPE 

HFVCGGCSTALGGSSFFEKDGAPFCPECYFERFSP 

RrnFrNOPTRHRTMVTAT nTHWHPFHFrrV^fYiF 

PFGDEGFHEREGRPYCRRDFLQLFAPRCQGCQGP 

DLDNYISALSALWHPDCFVCRECFAPFSGGSFFEH 

EGRPLCENHFHARRGSLCATCGLPVTGRCVSAL 

GRRFHPDHFTCTFCLRPLTKGSFQERAGKPYCQP 

CFLKLFG 

3589 

A 

226 

6793 

SPPKKSRKCNLSFRLISAERWRFFLLILMEMPRKP 

RLTLFVQRRIENIATEREFDPEEFYYLLEAAEGHA 

KEGQGIKTDIPRYIISQLGLNKDPLEEMAHLGNY 

DSGTAETPETDESVSSSNASLKLRRKPRESDFETI 

KLISNGAYGAVYFVRHKESRQRFAMKKINKQNL 

ILRNQIQQAFVERDILTFAENPFVVSMYCSFETRR

HLCMVMEYYEGGDCATLMKNMGPLPVDMARM 

YFAETVLALEYLHNYGIVHRDLKPDNLLVTSMG 

HIKLTDFGLSKVGLMSMTTNLYEGHIEKDAREFL 

DKQVCGTPEYIAPEVILRQGYGKPVDWWAMGII 

LYEFLVGCVPFFGDTPEELFGQVISDEINWPEKDE 

APPPDAQDLJTLLLRQNPLERLGTGGAYEVKQHR 

FFRSLDWNSLLRQKAEFIPQLESEDDTSYFDTRSE 

KYHHMETEEEDDTNDEDFNVEIRQFSSCSHRFSK 

VFSSIDRJTQNSAEEKEDSVDKTKSTTLPSTETLS 

WSSEYSEMQQLSTSNSSDTESNRHKLSSGLLPKL 

AISTEGEQDEAASCPGDPHEEPGKPALPPEECAQ 

EEPEVTTPASTISSSTLSVGSFSEHLDQINGRSECV 

DSTDNSSKPSSEPASHMARQRLESTEKKKISGKV 

TKSLSASALSLMIPGDMFAVSPLGSPMSPHSLSSD 

PSSSRDSSPSRDSSAASASPHQPIVIHSSGKNYGFT 

IRAIRVYVGDSDIYTVHHIVWNVEEGSPACQAGL 

KAGDLITHINGEPVHGLVHTEVIELLLKSGNKVSI

TTTPFENTSIKTGPARRNSYKSRMVRRSKKSKKK 

ESLERRRSLFKKLAKQPSPLLHTSRSFSCLNRSLS 

SGESLPGSPTHSLSPRSPTPSYRSTPDFPSGTNSSQ 

SSSPSSSAPNSPAGSGHIRPSTLHGLAPKLGGQRY 

RSGRRKSAGNIPLSPLARTPSPTPQPTSPQRSPSPL 

LGHSLGNSKIAQAFPSKMHSPPTIVRHIVRPKSAE 

PPRSPLLKRVQSEEKLSPSYGSDKKHLCSRKHSL 

EVTQEEVQREQSQREAPLQSLDENVCDVPPLSRA 

RPVEQGCLKRPVSRKVGRQESVDDLDRDKLKAK 

VVVKKADGFPEKQESHQKFHGPGSDLENFALFK 

LEEREKKVYPKAVERSSTFENKASMQEAPPLGSL 

LKDALHKQASVRASEGAMSDGPVPAEHRQGGG 

DFRRAPAPGTLQDGLCHSLDRGISGKGEGTEKSS 

QAKELLRCEKLDSKLANTOYLRKKMSLEDKEDN 

LCPVLKPKMTAGSHECLPGNPVRPTGGQQEPPPA 

SESRAFVSSTHAAQMSAVSFVPLKALTGRVDSGT 

EKPGLVAPESPVRKSPSEYKLEGRSVSCLEPIEGT 

LDIALLSGPQASKTELPSPESAQSPSPSGDVRASV 

PPVLPSSSGKKNDTTSARELSPSSLKMNKSYLLEP 

WFLPPSRGLQNSPAVSLPDPEFKRDRKGPHPTAR 

SPGTVMESNPQQREGSSPKHQDHTTDPKLLTCLG 

QNLHSPDLARPRCPLPPEASPSREKPGLRESSERG 

PPTARSERSAARADTCREPSMELCFPETAKTSDN 

SKNLLSVGRTHPDFYTQTQAMEKAWAPGGKTN 

HKDGPGEARPPPRDNSSLHSAGIPCEKELGKVRR 

GVEPKPEALLARRSLQPPGIESEKSEKLSSFPSLQ 

KDGAKEPERKEQPLQRHPSSIPPPPLTAKDLSSPA 

ARQHCSSPSHASGREPGAKPSTAEPSSSPQDPPKP 

VAAHSESSSHKPRPGPDPGPPKTKHPDRSLSSQK 

PSVGATKGKEPATQSLGGSSREGKGHSKSGPDVF 

PATPGSQNKASDGIGQGEGGPSVPLHTDRAPLDA 

KPQPTSGGRPLEVLEKPVHLPRPGHPGPSEPADQ 

KLSAVGEKQTLSPKHPKPSTVKDCPTLCKQTDN 

RQTDKSPSQPAANTDRRAEGKKCTEALYAPAEG 

DKLEAGLSFVHSENRLKGAERPAAGVGKGFPEA 

RGKGPGPQKPPTEADKPNGMKRSPSATGQSSFRS 

TALPEKSLSCSSSFPETRAGVREASAASSDTSSAK 

AAGGMLELPAPSNRDHRKAQPAGEGRTHMTKS 

DSLPSFRVSTLPLESHHPDPNTMGGASHRDRALS 

VTATVGETKGKDPAPAQPPPARKQNVGRDVTKP 

SPAPNTDRPISLSNEKDFWRQRRGKESLRSSPHK 

KAL 

3590 

A 

3 

935 

RATTRPKNEVQDYVSVEYLSPHMGGTDPFKYSY 

PPLVDDDFQTPLCENGPITSEDETSSKEDIESDGK 

ETLETISNEEQTPLLKKINPTESTSKAEENEKVDS 

KVKAFKKPLSVFKGPLLHISPAEELYFGSTESGEK 

KTLIVLTNVTKNIVAFKVRTTAPEKYRVICPSNSS 

CDPGASVDIVVSPHGGLTVSAQDRFLIMAAEME 

QSSGTGPAELTQFWKEVPRNKVMEHRLRCHTVE 

SSKPNTLTLKDNAFNMSDKTSEDICLQLSRLLES 

NRKLEDQVQRCIWFQQLLLSLTMLLLAFVTSFFY 

LLYS 

3591 

A 

303 

2 

GGSWGPLCPVSPAMSLSDPGLGYHPTCWTLRWP 

PLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCS 

CEAGGSCACAGSCKCKKCKCTSCKKSCCSCCPL 

3592 

A 

1052 

1779 

GKTMMRKMLLAAALSVTAMTAHADYQCSVTP 

RDDVIVSPQTVQVKGENGNLVITPDGNVMYNGK 

OYSLNAAOREOAKDYOAELRSTI PWTDFGAKSR 

VEKARIALDKIIVQEMGESSKMRSRLTKLDAQVK

EQMNRJIETRSDGLTFHYKAIDQVRAEGQQLVNQ 

AMGGILQDSINEMGAKAVLKSGGNPLQNVLGSL 

GGLQSSIQTEWKKQEKDFQQFGKDVCSRVVTLE 

DSRKALVGNLK 

3593

A 

3 

1837 

LSFEKVDIQTDNDLTKEMYEGKENVSFELQRDFS 

QETDFSEASLLEKQQEVHSAGNIKKEKSNTIDGT 

VKDETSPVEECFFSQSSNSYQCHTITGEQPSGCTG

LGKSISFDTKLVKHEIINSEERPFKCEELVEPFRCD 

SQLIQHQENNTEEKPYQCSECGKAFSINEKLIWH 

QRLHSGEKPFKCVECGKSFSYSSHYITHQTIHSGE 

KPYQCKMCGKAFSVNGSLSRHQRMTGEKPYQC 

KECGNGFSCSSAYITHQRVHTGEKPYECNDCGK 

AFNGNAKLIQHQRIHTGEKPYECNECGKGFRCSS

QLRQHQSIHTGEKPYQCKECGKGFNNNTKLIQH 

QRIHTASLAEQLFKASGNHPNWGCCLITSSPGPS

VYGPKMNMRGAPNSRLAGGREKRTQDTDFGQC 

SFLPSHSPSCFEPWNVTDYDSSWYRQKQVLSGV 

WSSPLSILKLPRTLIRISMQEMDTPGEMLMTGR 

GSLGPTLTTEAPAAAQPGKQGPPGTGRCLQAPGT 

EPGEQTPEGARELSPLQESSSPGGVKAEEEQRAG 

AEPGTRPSLARSDDNDHEVGALGLQQGKSPGAG 

NPEPEQDCAARAPVRAEAVRRMPPGAEAGSVVL 

DD 

3594 

A 

39 

261 

RAAMMDTSRVQPIKLAIVIKVLGRTGSQGQCTQ 

VRVEFMDDTSRSIIRSVKGPAAREGDVLTLLESERE 

ARRLR 

3595 

A 

973 

68 

GRVGTKHQMADDAGAAGGPGGPGGPGMGNRG 
GFRGGFGSGIRGRGRGRGRGRGRGRGARGGKAE 
DKEWMPVTKLGRLVKDMKJKSLEEIYLFSLPIKE 
SEIIDFFLGASLKDEVLKIMPVQKQTRAGQRTRF 

V A FV A Tr T nVWnWVrtl nVfCPQlcTPVATArRfiATTT A 

KLSIVPVRRGYWGNKIGKPHTVPCKVTGRCGSV 
LVRLIPAPRGTGIVSAPVPKKLLMMAGIDDCYTS 
ARGCTATLGNFAKATFDAISKTYSYLTPDLWKE 
TVFTKSPYQEFTDHLVKTHTRVSVQRTQAPAVA 
TT 

3596 

A 

106 

2960 

DERRVGAADMFGRSRSWVGGGHGKTSRNIHSL 

DHLKYLYHVLTKNTTVTEQNRNLLVETIRSITEIL

IWGDQNDSSVFDFFLEKNMFVFFLNILRQKSGRY 

VCVQLLQTLNILFENISHETSLYYLLSNNYVNSII 

VHKFDFSDEEIMAYYISFLKTLSLKLNNHTVHFF

YNEHTNDFALYTEAIKFFNHPESMVRIAVRTITL 

NVYKVSLDNQAMLHYIRDKTAVPYFSNLVWFIG 

SHVIELDDCVQTDEEHRNRGKLSDLVAEHLDHL 

HYLNDILIINCEFLNDVLTDHLLNRLFLPLYVYSL 

ENQDKGGERPKISLPVSLYLLSQVFLIIHHAPLVN 

SLAEVILNGDLSEMYAKTEQDIQRSSAKPSIRCFI 

KPTETLERSLEMNKHKGKRRVQKRPNYKNVGEE 

EDEEKGPTEDAQEDAEKAKGTEGGSKGIKTSGES 

EEIEMVIMERSKLSELAASTSVQEQNTTDEEKSA 

AATCSESTQWSRPFLDMVYHALDSPDDDYHALF 

VLCLLYAMSHNKGMDPEKLERIQLPVPNAAEKT

TYNHPLAERLIRIMNNAAQPDGKIRLATLELSCL

LLKQQVLMSAGCIMKDVHLACLEGAREESVHLV 

RHFYKGEDIFLDMFEDEYRSMTMKPMNVEYLM 

MDASILLPPTGTPLTGIDFVKRLPCGDVEKTRRAI 

RVFFMLRSLSLQLRGEPETQLPLTREEDLIKTDDV 

LDLNNSDLIACTVITKDGGMVQRSLAVDIYQMS 
t vpptwqpi nwnwvc a rzt t r\TY\/tr*\/Tri\/prYnQ 

L, V HrXJ V oKIAj WO V V JsJr/VOLL^JJiVlV^ V luv JdL/L/o 

RALNITIHKPASSPHSKPFPILQATFIFSDHIRCIIAK 

OPT AKTlRTOARRA/fK'A/fnPTA AT f DT PTOPTTPVT CI 

FGLGSSTSTQHLPFRFYDQGRRGSSDPTVQRSVF 
ASVDKVPGFAVAQCINEHSSPSLSSQSPPSASGSP 
SGSGSTSHCDSGGTSSSSTPSTAQSPAGIGHVTQ 

3597 

A 

427 

277 

GVRRIQHHWAQMHECNVHTYASLFCLFLLHTG 
KLCCLNSHRHFHCIKYSK 


3598

A

i 


PPPPTTf If ATAMVT FWVT n<2TP>JT PPPT OPNFOT 

MRELDQRTEDKKAEIDILAAEYISTVKTLSPDQR 

VERLQKIQNAYSKCKEYSDDKVQLAMQTYEMV 

DKHIRRLDADLARFEADLKDKMEGSDFESSGGR 

GLKKGRGQKEKRGSRGRGRRTSEEDTPKKKKH 

KGG 

3599 

A 

2 

3907 

KTITALAFSPDGKYLVTGESGHMPAVRVWDVAE 

HSQVAELQEHKYGVACVAFSPSAKYIVSVGYQH

DMIVNVWAWKKNIVVASNKVSSRVTAVSFSED 

CSYFVTAGNRHIKFWYLDDSKTSKVNATVPLLG 

RSGLLGELRNNLFTDVACGRGKKADSTFCITSSG 

I T PFF9DRRT 7 nifWVPT R WPPVKT><\NJOAPr PP 

SSFITCSSDNTIRLWNTESSGVHGSTLHRNILSSDL 

IKIIYVDGNTQALLDTELPGGDKADASLLDPRVGI

RSVCVSPNGQHLASGDRMGTLRVHELQSLSEML 

KVEAHDSEILCLEYSKPDTGLKLLASASRDRLIH 

VLDAGREYSLQQTLDEHSSSITAVKFAASDGQVR 

MISCGADKSIYFRTAQKSGDGVQFTRTHHVVRK

TTLYDMDVEPSWKYTAIGCQDRNIRIFNISSGKQ 

KKLFKGSQGEDGTLIKVQTDPSGIYIATSCSDKNL

SIFDFSSGECVATMFGHSEIVTGMKFSNDCKHLIS 

VSGDSCIFVWRLSSEMTISMRQRLAELRQRQRGG 

KQQGPSSPQRASGPNRHQAPSMLSPGPALSSDSD 

KEGEDEGTEEELPALPVLAKSTKKALASVPSPAL 

PRSLSHWEMSRAQESVGFLDPAPAANPGPRRRG 

RWVQPGVELSVRSMLDLRQLETLAPSLQDPSQD 

SLAIIPSGPRKHGQEALETSLTSQNEKPPRPQASQ 

PCSYPHIIRLLSQEEGVFAQDLEPAPIEDGIVYPEP 

SDNPTMDTSEFQVQAPARGTLGRVYPGSRSSEK 

HSPDSACSVDYSSSCLSSPEHPTEDSESTEPLSVD 

GISSDLEEPAEGDEEEEEEEGGMGPYGLQEGSPQ 

TPDQEQFLKQHFETLASGAAPGAPVQVPERSESR 

SISSRFLLQVQTRPLREPSPSSSSLALMSRPAQVPQ 

ASGEQPRGNGANPPGAPPEVEPSSGNPSPQQAAS 

VLLPRCRLNPDSSWAPKRVATASPFSGLQKAQS 

VHSLVPQERHEASLQAPSPGALLSREIEAQDGLG 

SLPPADGRPSRPHSYQNPTTSSMAKISRSISVGEN 

LGLVAEPQAHAPIRVSPLSKLALPSRAHLVLDIPK 

PLPDRPTLAAFSPVTKGRAPGEAEKPGFPVGLGK 

AHSTTFRWACI GFGTTPKPRTFrOAHPGP^PPA 

QQLPVSSLFQGPENLQPPPPEKTPNPMECTKPGA 

ALSQDSEPAVSLEQCEQLVAELRGSVRQAVRLY 

HSVAGCKMPSAEQSRIAQLLRDTFSSVRQELEAV 

AGAVLSSPGSSPGAVGAEQTQALLEQYSELLLRA 

VERRMERKL 

3600 

A 

1688 

916 

IPGSTISCSMALCEAAGCGSALLWPRLLLFGDSIT 
QFSFQQGGWGASLADRLVRKCDVLNRGFSGYN
TRWAKIILPRl IRKGNSI DIPVAVTIFFGANDSAL 

X 1\ »» xXXVXXX-'-l. IVJjll VlYVJ 1 1 OJL_»J-/ll V i\. V 1. AX X VJ All 1/OA.J-/ 

KDENPKQHIPLEEYAANLKSMVQYLKSVDIPENR 

VILITPTPLCETAWEEOCIIOGCKLNRLNSVVGEY 

ANACLQVAQDCGTDVLDLWTLMQDSQDFSSYL 

SDGLHLSPKGNEFLFSHLWPLIEKKVSSLPLLLPY 

WRDVAEAKPELSLLGDGDH 

3601 

A 

44 

223 

VITPLIPQLAKCFWTMNRAARNKSEKRYYSEFL
QIAHLFNYGLSSFLREFIIFLIKLLQ 

3602 

A 

37 

1124 

VPI^ASGKRRLEFRPQDSKACAATPHSPGRJTSR 

TRGSQKVRSVPPRLPWAQASASTDWEGLRGVPG 

PALRRENFLEAAASGRSGRTPTGGVGFRDVGGP 

HFPIFPAAHFLWCNLHTPRRPACNAPWHSPVGEI 

SPPPRESQLRRDPEVHFESPAHPLGFRLLPGRGLP 

ANAVTVETAAMAAPROIPSHIVRLKPSCSTDSSF 

TRTPVPTVSLASRELPVSSWQVTEPSSKNLWEQI 

CKEYEAEQPPFPEGYKVKQEPVITVAPVEEMLFH 

GFSAEHYFPVSHFTMISRTPCPQDKSCTINPKTCS 

Pl^YLETFIFPVLLPGIVxASLLHQAKKEKCFEVVL 

QMTPSGGKACVWGHLPSSSHTI 

3603 

A 

286 

587 

NISNKAEVSSHPSVISHSMDSFGQPRPEDNQSVLR 

RMQKKYWKTKQWIKATGKXEDEHLVASDAEL 

DAKLEVFHSVQETCTELLKIIEKYQLRLNGMKS 

3604 

A 

103 

2440 

QPRRJRVFPAAGRGPGRKCSQWGRQASVSFEDVT 

VDFSKEEWQHLDPAQRRLYWDVTLENYSHLLS 

VGYQIPKSEAAFKLEQGEGPWMLEGEAPHQSCS 

GEAIGKMQQQGIPGGIFFHCERFDQPIGEDSLCSI 

LEELWQDNDQLEQRQENQNNLLSHVKVLIKERG 

YEHKNIEKIIHVTTKLVPSIKRLHNCDTILKHTLN 

SHNHNRNSATKNLGKIFGNGNNFPHSPSSTKNEN 

AKTGANSCEHDHYEKHLSHKQAPTHHQKIHPEE 

KLYVCTECVMGFTQKSHLFEHQRIHAGEKSREC 

DKSNKVFPQKPQVDVHPSVYTGEKPYLCTQCGK 

VFTLKSNLITHQKIHTGQKPYKCSECGKAFFQRS

DLFRHLRIHTGEKPYECSECGKGFSQNSDLSIHQ 

KTHTGEKHYECNECGKAFTRKSALRMHQRIHTG 

EKPYVCADCGKAFIQKSHFNTHQRIHTGEKPYEC 

SDCGKSFTKKSQLHVHQRIHTGEKPYICTECGKV 

FTHRTNLTTHQKTHTGEKPYMCAECGKAFTDQS 

NLIKHQKTHTGEKPYKCNGCGKAFIWKSRLKIH 

QKSHIGERHYECKDCGKAHQKSTLSVHQRIHTG 

brvPY VCPbCGKAr lC^KbHr JLAHJriKJH i GbKP YbCb 

DCGKCFTKKSQLRVHQKIHTGEKPNICAECGKAF 

TDRSNLITHQKIHTREKPYECGDCGKTFTWKSRL 

NIHQKSHTGERHYECSKCGKAFIQKATLSMHQII 

ti 1 UKJvr Y AC i bCt^KAJr 1 IJKoiNLlJKJaylUVlHoUbK 

RYKASD 

3605 

A 

3 

322 

SFRMSGRGKGGKGLGKGGAKRHRKVLRDNIQGI 
TKPAIRRLARRGGVKRISGLIYEETRGVLKVFLEN 
VIRDAVTYTEHAKRKTVTAMDVVYALKRQGRT 
LYGFGG 

3606 

A 

1 

1749 

VPVTAEAKLMGFTQGCVTFEDVAIYFSQEEWGL 

LDEAQRLLYRDVMLENFALITALVCWHGMEDE 

ETPEQSVSVEGVPQVRTPEASPSTQKIQSCDMCV 

PFLTDILHLTDLPGQELYLTGACAVFHQDQKHHS 

AEKPLESDMDKASFVQCCLFHESGMPFTSSEVG 

KDFLAPLGILQPQAIANYEKPNKISKCEEAFHVGI 

SHYKWSQCRRESSHKHTFFHPRVCTGKRLYESS 

KCGKACCCECSLVQLQRVHPGERPYECSECGKS 

FSQTSHLNDHRRIHTGERPYVCGQCGKSFSQRAT 

LIKHHRVHTGERPYECGECGKSFSQSSNLIEHCRI 

HTGERPYECDECGKAFGSKSTLVRHQRTHTGEK 

PYECGECGKLFRQSFSLVVHQRIHTTARPYECGQ 

Luko r o JUtVCObll^riv^JL JrloO AKJ^rn.CUilL/UJvoroV^ 

RTTLNKHHKVHTAERPYVCGECGKAFMFKSKL 

VRHQRTHTGERPFECSECGKFFRQSYTLVEHQKI 

HTHT PPVTlPfif^OT£ < 5FTnK'<?<2T TOHHWHTHFRP 
n i vji_,ivr i Lyv^/^j^s^v^vJivor iv^rvooi-jiv^riv^ v vni ^jurur 

YECGKCGKSFTQHSGLILHRKSHTVERPRDSSKC 

vjJVi i or xvoiNi v 

3607 

A 

92 

331 

AMAGPGPGPGDPDEQYDFLFKLVLVGDASVGKT 
CVVQRFKTGAFSERQGSUGVDFTMKTLEIQGKR 
VKLQIWDTAGQER 

3608 

A 

545 

379 

AIKGYIHLSAPRNRYMHTTASNGRMLFMKVTM 
YMRRGVQIMGWSVRMAFMACFTQ 

3609 

A 

118 

873 

VWMAWQVSLLELEDRLQCPICLEVFKESLMLQC 
r;i4Qvr*K'r;r , T vqt qvtjt ivnfVPr^PMPwnvvnriQ 

vjxlo Y t^ivOCL. V oi-»o i rULJLI liv V xvL^r W^V V Dvjo 

SSLPNVSLAWVIEALRLPGDPEPKVCVHHRNPLS 
LFCEKDQELICGLCGLLGSHQHHPVTPVSTVCSR 
MKEELAALFSELKQEQKKVDELIAKLVKNRTRIV
NESDVFSWVIRREFQELRHPVDEEKARCLEGIGG
HTRGLVASLDMQLEQAQGTRERLAQAECVLEQF 
GNEDHHEFIWKFHSMASR 

3610 

A 

2 

987 

DPRVRPPLLQPPPPLLPRLVILKMAPLDLDKYVEI 

ARLCKYLPENDLKRLCDYVCDLLLEESNVQPVS 

TPVTVCGDIHGQFYDLCELFRTGGQVPDTNYIFM 

GDFVDRGYYSLETFTYLLALKAKWPDRITLLRG 

NHESROITOVYGFYDECOTKYGNANAWRYCTK 

VFDMLTVAALIDEQILCVHGGLSPDIKTLDQIRTI 

ERNQEIPHKGAFCDLVWSDPEDVDTWAISPRGA 

GWLFGAKVTNEFVHINNLKLICRAHQLVHEGYK 

FMFDEKLVTVWSAPNYCYRCGNIASIMVFKDVN 

TREPKLFRAVPDSERVIPPRTTTPYFL 

3611 

A 

245? 

869 

AEKMTAELREAMALAPWGPVKVKKEEEEEENF 

PGQASSQQVHSENIKVWAPVQGLQTGLDGSEEE 

EKGQNISWDMAVVLKATQEAPAASTLGSYSLPG 

TLAKSEILETHGTMNFLGAETKNLQLLVPKTEIC 

EEAEKPLIISERJQKADPQGPELGEACEKGNMLK 

RQRIKREKKDFRQVIVNDCHLPESFKEEENQKCK 

KSGGKYSLNSGAVKNPKTQLGQKPFTCSVCGKG 

FSQSANLVVHQRIHTGEKPFECHECGKAFIQSAN 

LVVHQRIHTGQKPYVCSKCGKAFTQSSNLTVHQ 

KIHSLEKTFKCNECEKAFSYSSQLARHQKVHITE 

DCGKAFTQSANLIVHQRSHTGEICPYECKECGKA 
FSCFSHLIVHQRIHTAEKPYDCSECGKAFSQLSCL 
IVHQRJHSGDLPYVCNECGKAFTCSSYLLIHQRIH 
NGEKPYTCNECGKAFRQRSSLTVHQRTHTGEKP 
YECEKCGAAFISNSHLMRHHRTHLVE 

3612 

A 

318 

2245 

SPMAEAALVNTPQIPMVTEEFVKPSQGHVTFEDI 

AVYFSQEEWGLLDEAQRCLYHDVMLENFSLMA

SVGCLHGIEAEEAPSEQTLSAQGVSQARTPKLGP 

SIPNAHSCEMCILVMKDILYLSEHQGTLPWQKPY 

TSVASGKWFSFGSNLQQHQNQDSGEKHIRKEESS 

ALLLNSCKIPLSDNLFPCKDVEKDFPTDLGLLQHQ 

TTHSRQEYAHRSRETFQQRRYKCEQVFNEKVHV 

TEHQRVHTGEKAYKRREYGKSLNSKYLFVEHQR 

THNAEKPYVCNICGKSFLHKQTLVGHQQRIHTRE 

RSYVCIECGKSLSSKYSLVEHQRTHNGEKPYVCN 

VCGKSFRHKQTFVGHQQRIHTGERPYVCMECGK 

SFIHSYDRIRHQRVHTGEGAYQCSECGKSFIYKQ

SLLDHHRIHTGERPYECKECGKAFIHKKRLLEHQ 

RIHTGEKPYVCIICGKSFIRSSDYMRHQRIHTGER 

AYECSDCGKAFISKOTLLKHHKIHTRERPYECSE 

CGKGFYLEVKLLQHQRIHTREQLCECNECGKVF

SHQKRLLEHQKVHTGEKPCECSECGKCFRHRTS 

LIQHQKVHSGERPYNCTACEKAFIYKNKLVEHQ 

1UHTGEKPYECGKCGKAFNKRYSLVRHQKVH1T 

EEP 

3613 

A 

817 

3345 

NQSHPDSETVTVEGGRRKMKSNQERSNECLPPK 

KREBPATSRSSEEKAPTLPSDNHRVEGTAWLPGN 

PGGRGHGGGRHGPAGTSVELGLQQGIGLHKALS 

TGLDYSPPSAPRSVPVATTLPAAYATPQPGTPVSP 

VQYAHLPHTFQFIGSSQYSGTYASFPSQLIPPTAN 

PVTSAVASAAGATTPSQRSQLEAYSTLLANMGS 

LSQTPGHKAEQQQQQQQQQQQQQQQQQQQQQ 

QQQHQQQQQQQQQQQQQQHLSRAPGLITPGSPP 

PAQQNQYVHISSSPQNTGRTASPPAIPVHLHPHQ 

TMIPHTLTLGPPSQVVMQYADSGSHFVPREATK 

KAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSA 

DLGLGKAGGKSVPHPYESRHVVVHPSPSDYSSR 

DPSGVRASVMVLPNSNTPAADLEVQQATHREAS 

PSTLNDKSGLHLGKPGHRSYALSPHTVIQTTHSA 

SEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITY 

AGSLPQHLVIPGTQPLLIPVGSTDMEASGAAPAIV 

TSSPQFAAVPHTFVTTALPKSENFNPEALVTQAA 

YPAMVQAQIHLPVVQSVASPAAAPPTLPPYFMK 

GSIIQLANGELKKVEDLKTEDFIQSAEISNDLKIDS 

o i v c< iViJC.L>/ori or v/\v lV^r/\ V OJDrlJxr\V^ Vo V C VL V 

EYPFFVFGQGWSSCCPERTSQLFDLPCSKLSVGD 
VCTST TT KNT KNGWfcTK'nnPVriPA WI T T^T-IQfc" A 

DGLAGSRHRYAEQENGINQGSAQMLSENGELKF 
PEKMGLSAAPFLTKIEPSKPAATRKRRWSAPESR
KLEKSEDEPPLTLPKPSLIPQEVKICIEGRSNVGK 

3614 

A 

3 

114 

FFESRLRCKCCEPRGSWARFGCWRLQPEFKPKQ 
LEG 

3615 

A 

3 

1603 

DAWALTNQFSDSKQHIEVLKESLTAKEQRAAILQ 

TEVDALRLRLEEKETMLNKKTKQIQDMAEEKGT 

QAGEIHDLKDMLDVKERKVNVLQKKIENLQEQL

RDKEKQMSSLKERVKSLQADTTNTDTALTTLEE 

ALAEKERTIERLKEQRDRDEREKQEEIDNYKKDL

KDLKEKVSLLQGDLSEKEASLLDLKEHASSLASS 

GLKXDSRLKTLEIALEQKKEECLKMESQLKKAH 

EAALEARASPEMSDRIQHLEREITRYKDESSKAQ 

AEVDRLLEILKEVENEKNDKDKKIAELESLTSRQ 

VKDQNKKVANLrCHKEQVEKKKSAQMLEEARRR 

ITAEREMVLAQEESARTNAEKQVEELLMAMEKV 

KQELESMKAKLSSTQQSLAEKETrlLTNLRAERR 

KHLEEVLEMKQEALLAAISEKDANIALLELSSSK 

KKTQEEVAALKREKDRLVQQLKQQTQNRMKLM 

ADNYEDDHFKSSHSNQTNHKPSPDQDEEEGIWA 

3616 

A 

244 

1420 

RRRWRARGGLVPTLAWAEATGAYVPGRDKPDL 

PTWKRNFRSALNRKEGLRLAEDRSKDPHDPHKI 

YEFVNSGVGDFSQPDTSPDTNGGGSTSDTQEDIL 

DELLGNMVLAPLPDPGPPSLAVAPEPCPQPLRSPS 

LDNPTPFPNLGPSENPLKRLLVPGEEWEFEVTAF 

YRGRQVFQQTISCPEGLRLVGSEVGDRTLPGWP 

WRAGQWLWAQRLGHCHTYWAVSEELLPNSGH 

GPDGEVPKDKEGGVFDLGPFIVGSLGPPDLITFTE 

GSGRSPRYALWFCVGESWPQDQPWTKRLVMVK 

WPTCLRALVEMARVGGASSLENTVDLfflSNSHP 

LSLTSDQYKAYLQDLVEGMDFQGPGES 

3617 

A 

852 

304 

RGGT 1 SKMAPVT KA A A AlsTAVHT F9PI OAPTPTV 

RASSTSQPLDQVTGSVWNLGRLNHVAIAVPDLE 

KAAAFYKNILGAQVSEAVPLPEHGVSVVFVNLG 

TSTTKMELLHPLGRDSPIAGFLQKNKAGGMHfflCIE 

VDNINAAVMDLKKKKIRSLSEEVKIGAHGKPVIF 

LHPKDCGGVLVELEQA 

3618 

A

3 

5992 

DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 
DDDMEGDEAWRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ

YLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRK 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKTNVALMCMLREIGKHINMDGTINVDDFKIIYI 

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQIIVCTPEKWDIITRKGGERTYTQLV 

RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN 

SFRPVPLEQTYVGITEKKAIKRFQIMNEIVYEKIM

EHAGKNQVLVFVHSRKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVHKGTQVYSPEKGRWTELGA 

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEIE 

LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 

ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY 

VTQSAGRLMRAIFEIVLNRGWAQLTDKTLNLCK 

MIDKRMWQSMCPLRQFRKLPEEVVKKIEKKNFP

FERLYDLNHNEIGELIRMPKMGKTIHKYVHLFPK 

LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS 

EAFWILVEDVDSEVILHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRVVSDRWLSCETQLPVSFR 

HLELPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

NIIISTPEKWDILSRRWKQRKNVQNINLFVVDEV

HLIGGENGPVLEVICSRMRYISSQIERPIRIVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNVRPVPLELHI 

QGFNISHTQTRLLSMAKPVFRAITKHSPKKPVIVF 

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL 

IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL 

VEQLFSSGAIQVVVASRSLCWGMNVAAHLVIIM

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ 

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR 

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGMLAAYYYINYTTIEL 

FSMSLNAKTKVRGLIEIISNAAEYENIPIRHHEDN

LLRQLAQKVPHKLNNPKTNDPHVKTNLLLQAHL

SRMQLSAELQSDTEEILSKAIRJLIQACVDVLSSNG 

WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 

PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT 

DSQIADVARFCNRYPNIELSYEVVDKDSIRSGGP 

VVVLVQLEREEEVTGPVIAPLFPQKREEGWWW 

IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVICEAETDS 
DSD 

3619 

A 

3 

5992 

DNIDETYGVNVQFESDEEEGDEDVYGEVREEAS 

DDDMEGDEAVVRCTLSANMYVDEILVWCASEL 

NIPEFFPLESPHKKVGYGLSSRTWLQGGGKVIEA 

GRDLLVASGELMSSKKKDLHPRDIDAFWLQRQL 

SRFYDDAIVSQKKADEVLEILKTASDDRECENQL 

VLLLGFNTFDFIKVLRQHRMMILYCTLLASAQSE 

AEKERIMGKMEADPELSKFLYQLHETEKEDLIRE 

ERSRRERVRQSRMDTDLETMDLDQGGEALAPRQ 

VLDLEDLVFTQGSHFMANKRCQLPDGSFRRQRX 

GYEEVHVPALKPKPFGSEEQLLPVEKLPKYAQA 

GFEGFKTLNRIQSKLYRAALETDENLLLCAPTGA 

GKTNVALMCMLREIGKHINMDGTINVDDFKIIYI

APMRSLVQEMVGSFGKRLATYGITVAELTGDHQ 

LCKEEISATQIIVCTPEKWDIITRKGGERTYTQLV

RLIILDEIHLLHDDRGPVLEALVARAIRNIEMTQE 

DVRLIGLSATLPNYEDVATFLRVDPAKGLFYFDN

SFRPVPLEQTYVGITEKKAIKRFQIMNEIVYEKIM

EHAGKNQVLVFVHSRJKETGKTARAIRDMCLEKD 

TLGLFLREGSASTEVLRTEAEQCKNLELKDLLPY 

GFAIHHAGMTRVDRTLVEDLFGDKHIQVLVSTA 

TLAWGVNLPAHTVIIKGTQVYSPEKGRWTELGA

LDILQMLGRAGRPQYDTKGEGILITSHGELQYYL 

SLLNQQLPIESQMVSKLPDMLNAEIVLGNVQNA 

KDAVNWLGYAYLYIRMLRSPTLYGISHDDLKGD 

PLLDQRRLDLVHTAALMLDKNNLVKYDKKTGN 

FQVTELGRIASHYYITNDTVQTYNQLLKPTLSEIE

LFRVFSLSSEFKNITVREEEKLELQKLLERVPIPVK 

ESIEEPSAKINVLLQAFISQLKLEGFALMADMVY 

VTQSAGRLMRAIFEIVLNRGWAQLTDKTLNLCK 

MIDKRMWQSMCPLRQFRKLPEEVVKKIEKKNFP 

FERLYDLNHNEIGELIRMPKMGKTIHKYVHLFPK 

LELSVHLQPITRSTLKVELTITPDFQWDEKVHGSS

EAFWILVEDVDSEVILHHEYFLLKAKYAQDEHLI 

TFFVPVFEPLPPQYFIRWSDRWLSCETQLPVSFR 

HLDLPEKYPPPTELLDLQPLPVSALRNSAFESLYQ 

DKFPFFNPIQTQVFNTVYNSDDNVFVGAPTGSGK 

TICAEFAILRMLLQNSEGRCVYITPMRLWQEQVY 

MDWYEKFQDRLNKKVVLLTGETSTDLKLLGKG 

NIIISTPEKWDILSRRWKQRKNVQNINLFVVDEV 

HLIGGENGPVLEVICSRMRYISSQIERPIRIVALSSS 

LSNAKDVAHWLGCSATSTFNFHPNYRPVPLELHI 

QGFNISHTQTRLLSMAKPVFHAITKHSPKKPVIVF 

VPSRKQTRLTAIDILTTCAADIQRQRFLHCTEKDL

IPYLEKLSDSTLKETLLNGVGYLHEGLSPMERRL

VEQLFSSGAIQVWASRSLCWGMNVAAHLVIIM 

DTLYYNGKIHAYVDYPIYDVLQMVGHANRPLQ

DDEGRCVIMCQGSKKDFFKKFLYEPLPVESHLD 

HCMHDHFNAEIVTKTIENKQDAVDYLTWTFLYR

RMTQNPNYYNLQGISHRHLSDHLSELVEQTLSDL 

EQSKCISIEDEMDVAPLNLGMIAAYYYINYTTIEL 

FSMSLNAKTKVRGLIEIISNAAEYENIPIRHHEDN 

LLRQLAQKWHKLNNPKTNDPHVKTNLLLQAHL 

SRMQLSAELQSDTEEILSKAIRLIQACVDVLSSNG 
WLSPALAAMELAQMVTQAMWSEDSYLRRLPPF 
PSGLFKRCTDKGVESVFDIMEMEDEERNALLQLT

DSQIADVARFCNRYPNIELSYEVVDKDSIRSGGP
VVVLVQLEREEEVTGPVIAPLFPQKREEGWWVV 
IGDAKSNSLISIKRLTLQQKAKVKLDFVAPATGG 
RHNTLYFMSDAYMGCDQEYKFSVDVKEAETDS 
DSD 

3620 

A 

1205 

323 

VIKMALAARLLPQFLHSRSLPCGAVRLRTPAVAE 
VRLPSATLCYFCRCRLGLGAALFPRSARALAASA 
LPAQGSRWPVLSSPGLPAAFASFPACPQRSYSTE 

FKPOOHOKTKMIVT GFWPINWVRTRIK A PT TWA 

YFDKEFSITEFSEGAKQAFAHVSKLLSQCKFDLL 

EELVAKEVLHALKEKVTSLPDNHKNALAANIDEI 

VrTSTGDISIYYDEKGRKFVNILMCFWYLTSANIP 

SETLRGASVFQVKLGNQNVETKQLLSASYEFQR 

EFTQGVKPDWTIARJEHSKLLE 

3621 

A 

2 

2995 

SSSRSRHSSISPVRLPLNSSLGAELSRKKKERAAA 

AAAAKMDGKESSYERSGSYSGRSPSPYGRRRSSS 

PFLSKRSLSRSPLPSRKSMKSRSRSPAYSRHSSSH 

SKKKRSSSRSRHSSISPVRLPLNSSLGAELSRKKK 

ERAAAAAAAKMDGKESSYERSGSYSGRSPSPYG 

RRRSSSPFLSKRSLSRSPLPSRKSMKSRSRSPAYS 

RHSSSHSKKKRSSSRSRHSSISPVRLPLNSSLGAEL 

SRKKKERAAAAAAAKMDGKESKGSPVFLPRKE 

NSSVEAKDSGLESKKLPRSVKLEKSAPDTELVNV 

THLNTEVKNSSDTGKVKLDENSEKHLVKDLKAQ 

GTRDSKPIALKEEIVTPKETETSEKETPPPLPTIASP 

PPPLPnTPPPQTPPLPPLPPlPALPQQPPLPPSQPA 

FSQVPASSTSTLPPSTHSKTSAVSSQANSQPPVQV 

SVKTQVSVTAAIPHLKTSTLPPLPLPPLLPGDDDM 

DSPKETLPSKPVKKEKEQRTRHLLTDLPLPPELPG 

GDLSPPDSPEPKAITPPQQPYKKRPKICCPRYGER 

RQTESDWGKRCVDKFDIIGIIGEGTYGQVYKAKD 

KDTGELVALKKVRLDNEKEGFPITAIREIKILRQL 

IHRSVVNMKEIVTDKQDALDFKKDKGAFYLVFE 

YMDHDLMGLLESGLVHFSEDHIKSFMKQLMEGL 

EYCHKKNFLHRDIKCSNDLLNNSGQIKLADFGLA 

RLYNSEESRPYTNKVITLWYRPPKLLLGEERYTP 

AIDVWSCGCILGELFTKKPIFQANLELAQLELISR 

LCGSPCPAVWPDVKLPYFNTMKPKKQYRRRLR 

EEFSFIPSAALDLLDHMLTLDPSKRCTAEQTLQSD 

FLKDVELSKMAPPDLPHWQDCHELWSKKRRRQ 

RQSGVVVEEPPPSKTSRKETTSGTSTEPVKNSSPA 

PPOPAPGKVF^GAODAIfiT AniTOOT "M09PT AVT 

LNLLQSQTDLSIPQMAQLLNIHSNPEMQQQLEAL 

NQSISALTEATSQQQDSETMAPEESLKEAPSAPVI 

LPSAEQTTLEASSTPADMQNILAVLLSQLMKTQE 

PAGSLEENNSDKNSGPOGPRRTPTMPOEE A AGR S 

NGGNAL 

3622 

A 

16 

390 

TPERGSAYPETAAVRRPAGECPITMSDLEAKLST
EHLGDKIK^EDlXLRVlGQDSSEniPKVKMTTPLK 
KLKKSYCQRQGVPVNSLRFLFEGQRIADNHTPEE 
LGMEEEDVIEVYQEQIGGHSTV 

3623 

A 

2 

1544 

PPPAPGPDGLNEGCLrlRLSN4PHQRPRTCAMNPE 

LTMESLGTLHGARGGGSGGGGGGGGGGGGGGP 

GHEQELLASPSPHHARRGPRGSLRGPPPPPTAHQ 

ELGTAAAAAAAASRSAMVTSMASILDGGDYRPE 

LSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQP 

LPPISTVSDKFHHPHPHHHPHHHHHHHHQRLSGN 

VSGSrnTLMRDERGLPAMNNLYSPYKEMPGMSQS 

LSPLAATPLGNGLGGLHNAQQSLPNYGPPGHDK 

MLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAM 

MSHLNGLHHPGHTQSHGPVLAPSRERPPSSSSGS 

A^/ATCPAT t?ri\TTI/ t?\ / A AD IT A CI 1/DVCIDAATCA 

QVATSGQLEE1N 1 KiiVAQRl 1 AbLKK Y MrQAlr A 

QRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRR 

MWKWLQEPEFQRMSALRLAACKRKEQEPNKDR 

NNSQKKSRLVFTDLQRRTLFAIFKENKRPSICEMQ 

ITISQQLGLELTTVSNFF1V1NARRJISLEKWQDDLS 

TGGSSSTSSTCTKA

3624 

A 

27 

2152 

SARKAEAATSGTAARDGSVGRNLVPPPSASAPK 

AEVESNEKDNRPEEEEQVIHEDDERPSEKNEFSR 

RKRSKSEDMDNVQSKRRRYMEEEYEAEFQVKIT 

AKGDINQKLQKVIQWLLEEKLCALQCAVFDKTL 

AELKTRVEKIECNKRHKTVLTELQAKIARLTKRF 

EAAKEDLKKRHEHPPNPPVSPGKTVNDVNSNNN 

MSYRNAGTVRQMLESKRNVSESAPPSFQTPVNT 

VSSTNLVTPPAVVSSQPKLQTPVTSGSLTATSVLP 

APNTATVVATTQVPSGNPQPTISLQPLPVILHVPV 

AVSSQPQLLQSHPGTLVTNQPSGNVEFISVQSPPT 

VSGLTKNPVSLPSLPNPTKPNNVPSVPSPSIQRNP 

TASAAPLGTTLAVQAVPTAHSIVQATRTSLPTVG 

PSGLYSPSTNRGPIQMKJPISAFSTSSAAEQNSNTT 

PRIENQTNKTIDASVSKKAADSTSQCGKATGSDS 

SGVIDLTMDDEESGASQDPKKLNHTPVSTMSSSQ 

PVSRPLQPIQPAPPLQPSGVPTSGPSQTTIHLLPTA 

PTTVNVTHRPVTQVTTRLPVPRAPANHQVVYTT 

LPAPPAQAPLRGTVMQAPAVRQVNPQNSVTVRV 

PQTTTYVVNNGLTLGSTGPQLTVHHRPPQVHTEP 

PRPVHPAPLPEAPQPQRLPPEAGSTSRPSEATLEV 

SHAFRVKMAIVLVMECPGGGSKLCHC 

3625 

A 

210 

1115 

ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQIT 

LQGSRRRQGRTAFPASGKKRETDYSDGDPLDVH 

KRLPSSTGEDRAVMLGFAMMGFSVLMFFLLGTT 

ILKPFMLSIQREESTCTAIHTDIMDDWLDCAr ICG 

VHCHGQGKYPCLQVFVNLSHPGQKALLHYNEE 

AVQINPKCFYTPKCHQDRNDLLNSALDIKEFFDH 

KNGTPFSCFYSPASQSEDVILIKKYDQMAIFHCLF 

WPSLTLLGGALIVGMVRLTQHLSLLCEKYSTVV 

RDEVGGKVPY1EQHQFKLCIMRRSKGRAEKS 

3626 

A 

9 

921 

SSVVEFSALSVSMACLSPSQLQKFQQDGFLVLEG 
FLSAEECVAMQQRIGEIVAEMDVPLHCRTEFSTQ 
EEEQLRAQGSTDYFLSSGDKIRFFFEKGVFDEKG 

\ttiI \7T4T»r , T/rvTVTT/TATT A I TT A TTTVm /FT/PTTTTCTTVl /AT 

NFLVPPEKSINKIGRALrlAHDPVFKS 

JLAKoJLOLv^lVlJr V V Vv^oJYL I lrl^V^"fLrvJvjl-' "or JrlV^l-/ 

ASFLYTEPLGRVLGVWIAVEDATLENGCLWFIPG 
SHTSGVSRRMVRAPVGSAPGTSFLGSEPARDNSL 
FVPTPVQRGALVLIHGEWHKSKQNLSDRSRQA 
YTFHLMEASGTTWSPENWLQPTAELPFPQLYT 

3627 

A 

231 

644 

INSSPRTGRDHQELNLHTERDSRSQRAVLKIPRQ 
NPG1FYWIFLPSRSHSASHGSRQRQVSCQGTQDEI 
LKMRNTFAELKNSLEALSSRMDQAEERIGTQAG 
VQWRDHGSLQPQPPEFKQCFHLSLPSSWDYRAC 
LS 

3628 

A 

2 

810 

GCKHLLQNSWYDPRVREADRVGQRARRPRAAM 

DWLMGKSKAKPNGKKPAAEERKAYLEPEHTKA 

RITDFQFKELVVLPREIDLNEWLASNTTTFFHHIN 

LQYSTISEFCTGETCQTMAVCNTQYYWYDERGK 

KVKCTAPQYVDFVMSSVQKLVTDEDVFPTKYG 

REFPSSFESLVRKICRHLFHVLAHIYWAHFKETLA 

LELHGHLNTLYVHFILFAREFNLLDPKJBTAIMDD 

LTEVLCSGGRRGSTVGAVGMGPAAGAPGAQNH 

VKER 

3629 

A 

699 

1604 

CSHGSSAVSAWSPLFQASEVERQLSMQVHALRE 

DFREKNSSTNQHIIRLESLQAEIKMLSDRKRELEH 

RLSATLEENDLLQGTVEELQDRVLILERQGHDKD 

LQLHQSQLELQEVRLSCRQLQVKVEELTEERSLQ 

SSAATSTSLLSEIEQSMEAEELEQEREQLTLLSVE 

MTALKEERDRLRVTSEDKEPKEQLQKAIRDRDE 

AIAKKNAVELELAKCRMDMMSLNSQLLDAIQQ 

KLNLSQQLEAWQDDMHRVIDRQLMDTHLKERS 

QPAAALCRGHSAGRGDEPSIAEGKRLFSFFRKI 

3630 

A 

423 

1 

PAKVLTLDIYLSKTEGAQVDEPVV1TPRAEDCGD 

WDDMEKRSSGRRSGRRRGSQKSTDSPGADAELP 

ESAARDDAVFDDEVAPNAASDNASAEKKVKSPR 

AALDGGVASAASPESKPSPGTKGQLRGESDRSK 

QPPPASSP 

3631 

A 

2082 

674 

WSGFWQLPGVRGVGSAPGGDGAEFTSRRGSSRR 

PGAACPGCRGAGSERAPGGMGRRRAPELYRAPF 

PLYALQVDPSTGLLIAAGGGGAAKTGIKNGVHF 

LQLELINGRLSASLLHSHDTETRATMNLALAGDI 

LAAGQDAHCQLLRFQAHQQQGNKAEBCAGSKEQ 

GPRQRKGAAPAEKKCGAETQHEGLELRVENLQA 

VQTDFSSDPLQKVVCFNHDNTLLATGGTDGYVR 

VWKVPSLEKVLEFKAHEGEIEDLALGPDGKLVT 

VGRDLKASVWQKDQLVTQLHWQENGPTFSSTP 

YRYQACRFGQVPDQPAGLRLFTVQ1PHKRLRQPP 

PCYLTAWDGSNFLPLRTKSCGHEWSCLDVSES 

GTFLGLGTVTGSVAIYIAFSLQCLYYVREAHGIV 

VTDVAFLPEKGRGPELLGSHETALFSVAVDSRCQ 

LHLLPSRRSVPVWLLLLLCVGLUVTILLLQSAFPG 

FL 

3632 

A 

942 

40 

PWCQRVEVRSCGSSKRSCSRWSGSSWDGSRSLG 

RGLNHTSLNRSPPFTPDTMTHCCSPCCQPTCCRT 

TCCRTTCWKPTTVTTCSSTPCCQPSCCVPSCCQP 

CCHPTCCQNTCCRTTCCQPTCVASCCQPSCCSTP 

CCQPTCCGSSCCGQTSCGSSCCQPICGSSCCQPCC 

HPTCYQTICFRTTCCQPTCCQPTCCRNTSCQPTCC 

GSSCCQPCCHPTCCQTICRSTCCQPSCVTRCCSTP 

L/Lv^r 1 v-UuooLC/Oy 1 LIN Hoo I LLr Lt.]\Jr 1 LLy J 1 

CYRTTCCRPSCCCSPCCVSSCCQPSCC 

3633 

A 

605 

3004 

GPEGYRGRRARHPSLGSTTGHCGGGRGAEGTGT 
DPAAPAARLNVDGLLVYFPYDYIYPEQFSYMRE 
LKRTLDAKGHGVLEMPSGTGKTVSLLAL1MAYQ 
RAYPLEVTKLiYCSRTVPEIEKVIEELRKLLNFYE 
KQEGEKLPFLGLALSSRKNLCIHPEVTPLRFGKD 

VDGKCHSLTASYVRAQYQHDTSLPHCRFYEEFD 

AHGREVPLPAGIYNLDDLKALGRRQGWCPYFLA 

RYSILHANVVVYSYHYLLDPKIADLVSKELARK 

AVVVFDEAHNIDNVCIDSMSVNLTRRTLDRCQG 

NLETLQKTVLR1KETDEQRLRDEYRRLVEGLREA 

SAARETDAHLANPVLPDEVLQEAVPGSIRTAEHF 

LGFLRRLLEYVKWRLRVQHWQESPPAFLSGLA 

QRVCIQRKPLRFCAERLRSLLHTLEITDLADFSPL 

TLLANFATLVSTYAKGFTIIIEPFDDRTPT1ANPIL 

HFSCMDASLAIKPVFERFQSVIITSGTLSPLDIYPK 

ILDFHPVTMATFTMTLARVCLCPMIIGRGNDQVA 

ISSKFETREDIAVIRNYGNLLLEMSAVVPDGIVAF 

FTSYQYMESTVASWYEQGILENIQRNKLLFBETQ 

UuAb 1 oVAJ^bKY^bACfcNUROAlLLoVARGKVS 

EGIDFVHHYGRAVIMFGVPYVYTQSRILKARLEY 

LRDQFQ1RENDFLTFDAMRHAAQCVGRAIRGKT 

U i UbMVr AUJtsJvr AKOi^isJKOJ^jjrKWlv^briJL/lDA 

NLNLTVDEGVQVAKYFLRQMAQPFHREDQLGL 

SLLSLEQLESEETLKRIEQIAQQL 

3634 

A 

159 

384 

LKMSSKTASTNNIAQARRTVQQLRLEASIERIKV 
ojSJKoAuLMd YCbbriAKbDPLLKjIr 1 oENPrKDKK 
TCIIL 

3635 

A 

5 

409 

TELSQLEKAHPPADMGRRKSKRKPPPKKKMTGT 
bbll^r lOrri^NHbJvoCDVrU^ 1 CjVlbC 1 V 
CLEEFQTPITCILGNLGFFQRVGRGLESGPCSSGP 
LCALVQGQSRPEEQVPPSDFCGVRRCRAGFQCQ 

3636 

A 

48 

282 

DHLKSCYQDSHEDPTKMKRFLFLLLTISLLVMVQ 
IQTGLSGQNDTSQTSSPSASSSMSGGIFLFFVANA1 
IHLFCFS 

3637 

A 

1 

1248 

ARAGSVVGSAAARGPPAGCRCERAARLPSSPAR 

RRRCDWVEDGAGRMEILMTVSKFAS1CTMGAN 

ASALEBCEIGPEQFPVNEHYFGLVNFGNTCYCNSV 

LQALYFCRPFREKGLAYKSQPRKKESLLTCLADL 

FHSIATQKKKVGVIPPKKFITRLRKEhJELFDNYM 

QQDAHEFLNYLLNTIADILQEERKQEKQNGRLPN 

GNIDNENNNSTPDPTWVHEIFQGTLTNETRCLTC 

JillooJsxiJciJJrJjlJlw.0 y\J VbyN 1 ol lrlULKUroiN Ibl 

LCSEYKYYCEECRSKQEAHKRMKVKKLPMILAL 

HLKRFKYMDQLHRYTKLSYRVVFPLELRLFNTS 

GDATNPDRMYDLVAVVVHCGSGPNRGHYIAIV 

KSHDFWLLFDDDIVEKIDAQAIEEFYGLTSDISKN 

SESGYILFYQSRD 

3638 

A 

11 

630 

PAGIPVSTISSDRRASTDLTRKMKPDETPMFDPNL 

T ^"FVnW^nMTATPQP A TQPTHPni7r:T VI "PPT PTA 
LlsJi v \J W ov^IN 1 A 1 forAIor 1 or vjbOJL VLrlvr bL, 1 A 

DLNRGFFKVLGQLTETGVVSPEQFMKSFEHMKK 

SGDYYVTVVEDVTLGQIVATATLIIEHKFIHSCAK 

RGRVEDWVSDECRGKQLGNLLLSTLTLLSKKL 

NCYKITLECLPQNVGFYKKFGYTVSEENYMCRR 

FLK 

3639 

A 

2 

1200 

PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHSSPL 

LPHAMKSPFYRCQNTTSVEKGNSAVMGGVLFST 

GLLGNLLALGLLARSGLGWCSRRPLRPLPSVFY 

MLVCGLTVTDLLGKCLLSPVVLAAYAQNRSLRV 

LAPALDNSLCQAFAFFMSFFGLSSTLQLLAMALE 
CWLSLGHPFFYRRHITLRLGALVAPWSAFSLAF 
CAI PFMOFGRTFVOYPPGTWPFTnMVMFPnQT QV 

LGYSVLYSSLMALLVLATVLCNLGAMRNLYAM 

HRRLQRHPRSCTRDCAEPRADGREASPQPLEELD 

HLLLLALMTVLFTMCSLPVIYRAYYGAFKDVKE 

KNRTSEEAEDLRALRFLSViSIVDPWIFIIFRSPVFR 

IFFHKIFIRPLRYRSRCSNSTNMESSL 

3640 

A 

930 

182 

PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEY 
AIEAIKLGSTAIGIQTSEGVCLAVEKRITSPLMEPS 

SIFKTVFTDAHTOrAM^nT TADAVTI IMARVPTH 

oiJL.rvi v L-ll—JrYi liVJ V^/A.lvlOvJJ^l/^J^'/A.rv 1 LlUivriiV V Cly 

NHWFTYNETMTVESVTQAVSNLALQFGEEDADP 

GAMSRPFGVALLFGGVDEKGPQLFHMDPSGTFV 

QCDARAIGSASEGAQSSLQEVYHKSMTLKEAIKS 

SLIILKQVMEEKLNATNIELATVQPGQNFHMFTK 

EELEEVIKDI 

3641 

A 

2 

1254 

PTGQGGRRAEARSCLLSKAMLGRSGYRALPLGD 

FDRFQQSSFGFLGSQKGCLSPERGGVGTGADVPQ 

SWPSCLCHGLISFLGFLLLLVTFPISGWFALK1VPT 

YERM1VFRLGR1RTPQGPGMVLLLPFIDSFQRVDL 

RTRAFNVPPCKLASKDGAVLSVGADVQFR1WDP 

VLSVMTVKJDLNTATRMTAQNAMTICALLKRPLR 

EIQMEKLKISDQLLLEINDVTRAWGLEVDRVELA 

NSMAGGAPSPGPADTVEMVSEVEPPAPQVGARS 
SPKQPLAEGLLTALQPFLSEALVSQVGACYQFNV 

VEMAEADLRALLCRELRPLGAYMSGRLKVKGD 
LAMAMKLEAVLRALK 

3642 

A 

1 

237 

RRGEIDMATEGDVELELETETSGPERPPEKPRKH 

DSGAADLERVTDYAEEKEIQSSNLETAMSVIGDR 

RSREOKAKOFR 

3643 

A 

94 

541 

RKERRRRRRRMEAVVFVFSLLDCCALIFLSVYFII 
TLSDLECDYTNARSCCSKLNKWVIPELIGHTIVTV 
LLLMSLHWFIFLLNLPVATWNIYRYIMVPSGNM 
GVFDPTEIHNRGQLKSHMKEAMIKLGFHLLCFF 
MYLYSMILALIND 

3644 

A 

95 

2808 

TSCRHFPITSEDPLNYLLILTVERIYAYQALPLGFL 

FCSRDPVPEYLNHCGVKYVLISDRASFCALHIFFS 

PFRNVFRPAAGGGIAPPPRLWFQPSLSDAEMEIPK 

LLPARGTLQGGGGGGIPAGGGRVHRGPDSPAGQ 

VPTRRLLLPRGPQDGGPGRRREEASTASRGPGPS 

LFAPRPHQPSGGGGGGGDDFFLVLLDPVGGDVE 

TAGSGQAAGPVLREEAEEGPGLQGGESGANPAG 

PTALGPRCLSAVPTPAPISAPGPAAAFAGTVTIHN 

QDLLLRFENGVLTLATPPPHAWEPGAAPAQQPG 

CLIAPQAGFPHAAHPGDCPELPPDLLLAEPAEPAP 

APAPEEEAEGPAAALGPRGPLGSGPGVVLYLCPE 

ALCGQTFAKKHQLKMHLLTHSSSQGQRPFKCPL 

GGCGWTFTTSYKLKRHLQSHDKLRPFGCPAEGC 

GKSFTTVYNLKAHMKGHEQENSFKCEVCEESFP 

TQAKLGAHQRSHFEPERPYQCAFSGCKKTFITVS 

ALFSHNRAHFREQELFSCSFPGCSKQYDKACRLK 

IFfLRSHTGERPFLCDFDGCGWNFTSMSKLLRHKR 

KHDDDRRFMCPVEGCGKSFTRAEHLKGHSITHL 

STKPFVCPVAGCCARFSARSSLYIHSKKHLQDVD 
TWKSRCPISSCNKLFTSKJ4SMKTHMVKJRJHDK 

DLLAQLEAANSLTPSSELTSQRQNDLSDAEIVSLF 

SDVPDSTSAALLDTALVNSGILTIDVASVSSTLAG 

HLPANNNNSVGQAVDPPSLMATSDPPQSLDTSLF 

FGTAATGFQQSSLNMDEVSSVSVGPLGSLDSLA 

MKNSSPEPQALTPSSKLTVDTDTLTPSSTLCENSV 

SELLTPAKAEWSVHPNSDFFGQEGETQFGFPNAA 

GNHGSQKERNLITVTGSSFLV 

3645 

A 

2194 

1707 

TVSFHKTMASLKCSTVVCVICLEKPKYRCPACRV 

PYCSVVCFRKHKEQCNPETRPVEKK1RSALPTKT 

VKPVENKDDDDSIADFLNSDEEEDRVSLQNLKN 

LGESATLRSLLLNPHLRQLMVNLDQGEDKAKLM 

RAYMQEPLFVEFADCCLGIVEPSQNEES 

3646 

A 

85 

1948 

ERGGGKAAAAAAAAAAARALAASGQDPRPHPR 

APPWDDSGDDDEATTPADKSELHHTLKNLSLKL 

DDLSTCNDLIAKHGAALQRSLTELDGLKIPSESG 

EKLKVVNERATLFRJTSNAM1NACRDFLELAEIHS 

RKWQRALQYEQEQRVHLEETIEQLAKQHNSLER 

AFHSAPGRPANPSKSFIEGSLLTPKGEDSEEDEDT 

EYFDAMEDSTSF1TVITEAKEDSRKAEGSTGTSSA 

DWSSADNVLDGASLVPKGSSKVKRRVRIPNKPN 

YSLNLWSfMKNCIGRELSRIPMPVNFNEPLSMLQ 

RLTEDLEYHHLLDKAVHCTSSVEQMCLVAAFSV 

SSYSTTVHRIAKPFNPMLGETFELDRLDDMGLRS 

LCEQVSHHPPSAAHYVFSKHGWSLWQEITISSKF 

RGKYISIMPLGAIHLEFQASGNHYVWRKSTSTVH 

NI1VGKLWIDQSGDIEIVNHKTNDRCQLKFLPYSY 

FSKEAARKVTGVVSDSQGKAHYVLSGSWDEQM 

ECSKVMHSSPSSPSSDGKQKTVYQTLSAKLLWK 

KYPLPENAENMYYFSELALTLNEHEEGVAPTDS 

RLRPDQRLMEKGRWDEANTEKQRLEEKQRLSR 

RRRLEACGPGSSCSSEE 

3647 

A 

46 

5007 

PTGDACVSTSCELASALSHLDASHLTENLPKAAS 

ELGQQPMTELDSSSDLISSPGKKGAAHPDPSKTS 

VDTGQVSRPENPSQPASPRVTKCKARSPVRLPHE 

GSPSPGEKAAAPPDYSKTRSASETSTPHNTRRVA 

ALRGAGPGAEGMTPAGAVLPGDPLTSQEQRQGA 

PGNHSKALEMTGIHAPESSQEPSLLEGADSVSSR 

APQASLSMLPSTDNTKEACGHVSGHCCPGGSRE 

SPVTDIDSFIKELDASAARSPSSQTGDSGSQEGSA 

QGHPPAGAGGGSSCRAEPVPGGQTSSPRRAWAA 

GAPAYPQWASQPSVLDSINPDKHFTVNKNFLSN 

YSRNFSSFHEDSTSLSGLGDSTEPSLSSMYGDAE 

DSSSDPESLTEAPRASARDGWSPPRSRVSLHKED 

PSESEEEQIEICSTRGCPNPPSSPAHLPTQAAICPAS 

AKVLSLKYSTPRESVASPREKVACLPGSYTSGPD 

SSQPSSLLEMSSQEHETHADISTSQNHRPSCAEET 

TEVTSASSAMENSPLSKVARHFHSPPIILSSPNMV 

NGLEHDLLDDETLNQYETSINAAASLSSFSVDVP 

KNGESVLENLHISESQDLDDLLQKPKMIARRPIM 

AWKEINKHNQGTHLRSKTEKEQPLMPARSPDS 

KIQMVSSSQKKGVTVPHSPPQPKTNLENKDLSKK 

SPAEMLLTNGQKAKCGPKLKRLSLKGKAKVNSE 

APAANAVKAGGTDHRKPLISPQTSHKTLSKAVS 

QRLHVADHEDPDRNTTAAPRSPQCVLESKPPLAT 
SGPLKPSVSDTSIRTFVSPLTSPKPVPEQGMWSRF 

HMAVLSEPDRGCPTTPKSPKCRAEGRAPRADSG 

PVSPAASRNGMSVAGNRQSEPRLASHVAADTAQ 

PRPTGEKGGNIMASDRLERTNQLK1VEISAEAVSE 

TVCGNKPAESDRRGGCLAQGNCQEKSEIRLYRQ 

VAESSTSHPSSLPSHASQAEQEMSRSFSMAKLAS 

SSSSLQTAIRKAEYSQGKSSLMSDSRGVPRNSIPG 

GPSGEDHLYFTPRPATRTYSMPAQFSSHFGREGH 

PPHSLGRSRDSQVPVTSSVVPEAKASRGGLPSLA 

NGQGIYSVKPLLDTSRNLPATDEGDIISVQETSCL 

VTDKIKVTRRHYCYEQNWPHESTSFFSVKQRJKS 

FENLANADRPVAKSGASPFLSVSSKPPIGRRSSGS 

IVSGSLGHPGDAAARLLRRSLSSCSENQSEAGTL 

LPQMAKSPSIMTLTISRQNPPETSSKGSDSELKKS 

LGPLGIPTPTMTLASPVKR>JKSSVRHTQPSPVSRS 

KLQELRALSMPDLDKLCSEDYSAGPSAVLFKTEL 

EITPRRSPGPPAGGVSCPEKGGNRACPGGSGPKT 

SAAETPSSASDTGEAAQDLPFRRSWSVNLDQLLV 

SAGDQQRLQSVLSSVGSKSTILTLIQEAKAQSENE 

EDVCFIVLNRKEGSGLGFSVAGGTDVEPKSITVH 

RVFSQGAASQEGTMNRGDFLLSVNGASLAGLAH 

GNVLKVLHQAQLHKDALVVIKKGMDQPRPSAR 

QEPPTANGKGLLSRKTIPLEPGIGRSVAVHDALC 

VEVLKTSAGLGLSLDGGKSSVTGDGPLVIKRVY 

KGGAAEQAGIIEAGDEILAINGKPLVGLMHFDA 

WNIMKSVPEGPVQLLiRKHRNSS 

3648 

A 

337 

1564 

KSRLSVTLMPVQLSEHPEWNESMHSLRISVGGLP 

VLASMTKAADPRFRPRWKVVLTFFVGAAILWLL 

CSHRPAPGRPPTHNAHNWRLGQAPANWYNDTY 

PLSPPQRTPAGIRYRIAVIADLDTESRAQEENTWF 

TYLKKGYLTFSDSGDKVAVEWDKDHGVLESHL 

AEKGRGMELSDLIVFNGKLYSVDDRTGVVYQIE 

GSKAVPWVILSDGDGTVEKGFKAEWLAVKDER 

LWGGLGKEWTTTTGDVVNENPEWVKVVGYK 

GSVDHENWVSNYNALRAAAGIQPPGYLIHESAC 

WSDTLQRWFFLPREASQERYSEKDDERKGANLL 

LSASPDFGDIAVSHVGAVVPTHGFSSFKFIPNTDD 

QIIVALKSEEDSGRVASYIMAFTLDGRFLLPETKI 

GSVKYEGIEFI 

3649 

A 

1 

775 

PTRPGSGSAGGARVGSGEFGVEMAALAPLPPLPA 

QFKSIQHHLRTAQEHDKRDPVVAYYCRLYAMQ 

TGMKIDSKTPECRKFLSKLMDQLEALKKQLGDN 

EAITQEIVGCAHLENYALKMFLYADNEDRAGRF 

HKNMIKSFYTASLLIDVITVFGELTDENVKHRKY 

ARWKATYIHNCLKNGETPQAGPVGIEEDNDIEEN 

EDAGAASLPTQPTQPSSSSTYDPSNMPSGNYTGI 

QIPPGAHAPANTPAEVPHSTGVAK 

3650 

A 

20 

963 

KMAATLGPLGSWQQWRRCLSARDGSRRLLLLL 

LLGSGQGPQQVGAGQTFEYLKREHSLSKPYQGE 

APRPCFLRDWELQVHFKIHGQGKKNLHGDGLAI 

WYTKDRMQPGPVFGNMDKPVGLGVFVDTYPNE 

EKQQERVFPYISAMVNNGSLSYDHERDGRPTEL 

GGCTArVRNLHYDTFLVIRYVKRHLTIMMDDDGK 

HEWRDCIEVPGVRLPRGYYFGTSSITGDLSDNHD 

VISLKLFELTVERTPEEEKLHRI)VFLPSVDNMKL 
PEMTAPLPPLSGLALFLIVFFSLVFSVFAIVIGIILY 
NKWQEQSRKRFY 

3651 

A 

1 

1218 

RSWAYVKKCKNNMCPNRGLHDGPEPCWLHHA 

AGTVSAVQARGLQPSQSRSRPRVPGLATALAYG 

PAHTPPLSRIGWAMQPPPPGPLGDCLRDWEDLQ 

QDFQNIQVSAAADAGSPPSRVSLAQGQGSGSPGC 

KPSLPAEAEGAAQELENQMKERQGLFFDMEAYL 

PKKNGLYLSLVLGNVNVTLLSKQAKFAYKDEYE 

KFKLYLTIILILISFTCRFLLNSRVTDAAFNFLLVW 

YYCTLTIRESILINNGSRIKGWWVFHHYVSTFLSG 

VMLTWPDGLMYQKFRNQFLSFSMYQSFVQFLQ 

YYYQSGCLYRLRALGERHTMDLTVEGFQSWMW 

RVLTFLLPFLFFGHFWQLFNALTLFNLAQDPQCK 

EWQVLMCGFPFLLLFLGNFFTTLRVVHHKFHSQ 

RHGSKKD 

3652 

A 

640 

164 

VTTSCIIPFAFGLGVRASERLAEIDMPYLLKYQPM 

MQT1GQKYCMDPAVIAGVLSRKSPGDKILVNMG 

DRTSMVQDPGSQAPTSWISESQVFQTTEVLTTRI 

TELQRRFPTWTPDQYLRGGLCAYSGGAGYVRSS 

QDLSCDFCNDVLARAKYLKRHGF 

3653 

A 

2 

909 

IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGIRJKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGY1CFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 

3654 

A 

2 

909 

IVRRDWQEVSDIHLAMANCKMTKSIRFPALEHC 

YTGGEVVLPKDQEEWKRRTGLLLYENYGQSETG 

LICATYWGMKIKPGFMGKATPPYDVQFHMEASV 

ENCIIVSMNTADPGSQGITHSLLLQVIDDKGSILPP 

NTEGNIGIRJKPVRPVSLFMCYEGDPEKTAKVEC 

GDFYNTGDRGKMDEEGYICFLGRSDDIINASGYR 

IGPAEVESALVEHPAVAESAVVGSPDPIRGEVVK 

AFIVLTPQFLSHDKDQLTKELQQHVKSVTAPYKY 

PRKVEFVSELPKTITGKIERKELRKKETGQM 

3655 

A 

2 

2364 

SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDTG 

MVAHINNSRLKAKGVGQHDNAQNFGNQSFEEL 

RAACLRKGELFEDPLFPAEPSSLGFKDLGPNSKN 

VQNISWQRPKDIINNPLFIMDGISPTDICQGILGDC 

WLLAAIGSLTTCPKLLYRVVPRGQSFKKNYAGIF 

HFQIWQFGQWVNWVDDRLPTKNDKLVFVHST 

ERSEFWSALLEKAYAKLSGSYEALSGGSTMEGL 

EDFTGGVAQSFQLQRPPQNLLRLLRKAVERSSL 

MGCSIEVTSDSELESMTDKMLVRGHAYSVTGLQ 

DVHYRGKMETLIRVRNPWGRIEWNGAWSDSAR 

EWEEVASDIQMQLLHKTEDGEFWMSYQDFLNN 

FTLLEICNLTPDTLSGDYKSYWHTTFYEGSWRTG 

SSAGGCRNHPGTFWTNPQFKISLPEGDDPEDDAE 

GNVWCTCLVALMQKNAVRHARQQGAQLQTIGF 

VLYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEIF 

TNSREVSSQLRLPPGEYIIIPSTFEPHRDADFLLRV 

FTEKHSESWELDEVNYAEQLQEEKVSEDDMDQ 
DFLHLFKIVAGEGK£IGVYELQRLLNRMAIKFKS 

FKTKGFGLDACRCMINLMDKDGSGKLGLLEFKI 

LWKKLKKWMDIFRECDQDHSGTLNSYEMRLVIE 

KAGIKLNNKVMQVLVARYADDDLIIDFDSFISCF 

LRLKTMFTFFLTMDPKNTGHICLSLEQVLGEGW 

EGICRIAPACPSTPPPPSSDVPGPASCPRLFPPWDL 

LPVSTVAADDHVGIEAL 

3656 

A 

3 

174 

PLCTHYLLPELPEKSSRTSPRSRPGNMLSGDPHLP 
QPLCHCLDHCPCCFSGKRLVA 

3657 

A 

1 

444 

DTRSTYHNAHSLPTYVKSPAPCQMTYIKSPAPCQ 

TQTCYVQGASPCQSYYVQAPASGSTSQYCVTDP 

CSAPCSTSYCCLAPRTFGVSPLRRWIQRPQNCNT 

GSSGCCENSGSSGCCGSGGCGCSCGCGSSGCCCL 

GIIPMKSRSPALL 

3658 

A 

92 

1537 

SEAPVQPQPYTOTSFYSTSSCPLGCTMAPGARNV 

FVSP1DVGCQPVAEANAASMCLLANVAHANRVR 

VGSTPLGRPSLCLPPTSHTACPLPGTCH1PGNIGIC 

GAYGKNTLNGHEKETMKFLNDRLANYLEKVRQ 

LEQENAELETTLLERSKCHESTVCPDYQSYFRTIE 

ELQQKILCSKAENARLIVQIDNAKLAADDFRJKL 

ESERSLHQLVEADKCGTQKLLDDATLAKADLEA 

QQESLKEEQLSLKSNHEQEVKILRSQLGEKFRIEL 

DIEPTIDLNRVLGEMRAQYEAMVETNHQDVEQ 

WFQAQSEGISLQAMSCSEELQCCQSEDLELRCTV 

NALEVERQAQHTLKDCLQNSLCEAEDRYGTELA 

QMQSLISNLEEQLSEIRADLERQNQEYQVLLDVK 

ARLENEIATYRNLTPLQSLFHACLLYFLSKLWPC 

F1RWVSLWPWSQHGEMILKARVRRLRLVALGSG 

VPSPCPVFLQD 

3659 

A 

2 

402 

DLLQCLNQLYSASTEMSCQQSQQQCQPPPKCTP 
KCPPKCTPKCPPKCPPKCPPQYSAPCPPPVSSCCG 
SSSGGCCSSEGGGCCLSHHRPRQSLRRRPQSSSC 
CGSGSGQQSGGSSCCHSSGGSGCCHSSGGCC 

3660 

A 

26 

710 

CSAVEVKMAARTAFGAVCRRLWQGLGNFSVNT 

SKGNTAKNGGLLLSTNMKWVQFSNLHVDVPKD 

LTKPVVTISDEPDILYKRLSVLVKGHDKAVLDSY 

EYFAVLAAKELGISIKVHEPPRKIERFTLLQSVHI 

YKKHRVQYEMRTLYRCLELEHLTGSTADVYLEY 

IQRNLPEGVAMEVTKFCFFIFLDTIRTVTRTHQGA 

NLGNTIRRKRRKQVIKPQGGHFCLNLK 

3661 

A 

2 

370 

DVSVAASEPTVYRNPTKMSCQQNQQQCQPPPKC 
PIPKYPPKCPSKCASSCPPPISSCCGSSSGGCCSSG 
GCGCCSSEGGGCCLSHHRHHRSHCHRPKSSNCY 
GSGSGQQSGGSGCCSGGGCC 

3662 

A 

205 

1277 

RKSLPHPNPQKMLKKPLSAVTWLCIFIVAFVSHP 

AWLQKLSKHKTPAQPQLKAANCCEEVKELKAQ 

VANLSSLLSELNKKQERDWVSVVMQVMELESN 

SKRMESRLTDAESKYSEMNNQID1MQLQAAQTV 

TQTSAGKETSPLRERGVPPHLQHCFYIPPDDFLGS 

PELEVFCDMETSGGGWTIIQRRKSGLVSFYRDW 

KQYKQGFGSIRGDFWLGNEHDHRLSRQPTRLRVE 

MEDWEGNLRYAEYSHFVLGNELNSYRLFLGNY 

TGNVGNDALQYHNNTAFSTKDKDNDNCLDKCA 

QLRKGGYWYNCCTDSNLNGVYYRLGEHNKHLD 

GITWYGWHGSTYSLKRVEMKIRPEDrTCP 

3663 

A 

64 

1456 

LSSAKETLAQMYNTVWNMEDLDLEYAKTDINC 

GTDLMFYIEMDPPALPPKPPKPTTVANNGMNNN 

MSLQDAEWYWGDISREEVNEKLRDTADGTFLV 

RDASTKMHGDYTLTLRKGGNNKLIKIFHRDGKY 

GFSDPLTFSSVVELINHYRNESLAQYNPKLDVKL 

LYPVSKYQQDQVVKEDN1EAVGKKLHEYNTQFQ 

EKSREYDR1.YEEYTRTSQEIQMKRTA1EAFNETIK 

IFEEQCQTQERYSKEYmKFKMGNEKEIQRIMFIN 

YDICLKSRISEIIDSRRRLEEDLKKQAAEYKEIDKR 

MNSIKPDL1QLRKTRDQYLMWLTQKGVRQKKL 

NEWLGNENTEDQYSLVEDDEDLPHHDEKTWNV 

GSSNRNKAENLLRGKRDGTFLVRESSKQGCYAC 

SVVVDGEVKHCYINKTATGYGFAEPYNLYSSLK 

ELVLHYQHTSLVQHNDSLNVTLAYPVYAQQRR 

3664 

A 

944 

406 

GATVEDQSCNFGSLRWWSVPHISARSCPDPLLS 
RTGRVPGGRGAGLPRrlHSPRCCLQVFFNGANVR 
QVDVPTLTGAFGILAAHVPTLQVLRPGLVVVHA 
EDGTTSKYFVSSGSIAVNADSSVQLLAEEAVTLD 
MLDLGAAKANLEKAQAELVGTADEATRAEIQIR 
IEANEALVKALE 

3665 

A 

98 

1388 

ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSFHEH 

RHQSGRCLSTGMAPNLKGRPRKKKPCPQRRDSF 

SGVKDSNNNSDGKAVAKVKCEARSALTKPKNN 

HNCKKVSNEEKPKVAIGEECRADEQAFLVALYK 

YMKERKTPIERIPYLGFKQINLWTMFQAAQKLG 

GYETITARRQWKHIYDELGGNPGSTSAATCTRR 

HYERLILPYERFIKGEEDKPLPPIKPRKQENSSQE 

NENKTKVSGTKRIKHEIPKSKKEKENAPKPQDAA 

EVSSEQEKEQETLISQKSIPEPLPAADMKKKIEGY 

QEFSAKPLASRVDPEKDNETDQGSNSEKVAEEA 

GEKGPTPPLPSAPLAPEKDSALVPGASKQPLTSPS 

ALVDSKQESKLCCFTESPESEPQEASFPRLPHHTG 

HRWQTRMRRRMTNCPPWQITLPTAP 

3666 

A 

113 

1492 

LLQEMCTKTIPVLWGCFLLWNLYVSSSQTIYPGI 

KARITQRALDYGVQAGMKMIEQMLKEKKLPDL 

SGSESLEFLKVDYVNYNFSNIKISAFSFPNTSLAF 

VPGVGIKALTNHGTANISTDWGFESPLFVLYNSF 

AEPMEKPILKNLNEMLCPIIASEVKALNANLSTLE 

VLTKIDNYTLLDYSLISSPEITENYLDLNLKGVFY 

PLENLTDPPFSPVPFVLPERSNSMLYIGIAEYFFKS 

ASFAHFTAGVFNVTLSTEEISNHFVQNSQGLGNV 

LSR1AEIYILSQPFMVR1MATEPPIINLQPGNFTLDJ 

PASIMMLTQPKNSTVETIVSMDFVASTSVGLV1L 

GQRLVCSLSLNRFRLALPESNRSNIEVLRFENILSS 

1LHFGVLPLANAKLQQGFPLPNPHKFLFVNSD1EV 

LEGFLLISTDLKYETSSKQQPSFHVWEGLNLISRQ 

WRGKSAP 

3667 

A 

1 

181 

FRGRLGSGRNGGGSMNAPPAFESFLLFEGEKITIN 
KDTKVPNACLFTrNKEDHTLGNIIK 

3668 

A 

212 

431 

VAGEAVPFFPMMYSEPLKPSYLALVLWYFLLTG 
YCITKPEVIFBCIEQGEEPWILEKGFPSQCHPAKYL 
WCLHD 

3669 

A 

458 

1056 

FSGVCFAGIAGSMATLLHDAVMNPAEVVKQRLQ 
MYNSQHRSAISCIRTVWRTEGLGAFYRSYTTQLT 
MNIPFQSmFITYEFLQEQVNPrmTYNPQSffllSGG 
LAGALAAAATTPLDVCKTLLNTQENVALSLANIS 
GRLSGMANAFRTVYQLNGLAGYFKG1QARVIYQ 
MPSTAISWSVYEFFKYFLTKRQLENRAPY 

3670 

A 

145 

298 

RNPCPLTFLPSTLMVLLLSLTFFSALTFHSICQLRN 
TGVEVDIVFQRVSFL 

3671 

A 

3 

462 

ILKVAKKERTMSSLPVPYKLPVSLSVGSCVHKGT 
P1HSFINDPQLQVDFYTDMDEDSDIAFRFRVHFG 
NHVVMNRREFGIWMLEETTDYVPFEDGKQFELC 
IYVHYNEYEIKVNGHTHLRALSHRIPPSFVEDGC I 
KCPRRYLPWTSVCVCN 

3672 

A 

1 

1028 

HYAKLGTRPRLKFMSSPSLSDLGKREPAAAADE 

RGTQQRRACANATWNSIHNGVIAVFQRKGLPDQ 

ELFSLNEGVRQLLKTELGSFFTEYLQNQLLTKGM 

VILRDK1RFYEGQKLLDSLAETWDFFFSDVLPML 

QAIFYPVQGKEPSVRQLALLHFRNAITLSVKLED 

ALARAHARVPPAIVQMLLVLQGVHESRGVTEDY 

LRLETLVQKVVSPYLGTYGLHSSEGPFTHSCILEK 

RLLRRSRSGDVLAKNPVVRSKSYNTPLLNPVQE 

HEAEGAAAGGTSIRRHSVSEMTSCPEPQGFSDPP 

GQGPTGTFRSSPAPHSGPCPSRLYPTTQPPEQGLD 

PTRS 

3673 

A 

2 

712 

RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDFCLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQ1GFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKJCR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 

3674 

A 

2 

712 

RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYF 

TSSSVNSSAYTIYMGKDKYENEDLIKHGWPEDI 

WFHVDKLSSAHVYLRLHKGENIEDIPKEVLMDC 

AHLVKANSIQGCKMNNVNVVYTPWSNLKKTAD 

MDVGQIGFHRQKDVKIVTVEKKVNEILNRLEKT 

KVERFPDLAAEKECRDREERNEKKAQIQEMKKR 

EKEEMKKKREMDELRSYSSLMKVENMSSNQDG 

NDSDEFM 

3675 

A 

921 

1321 

VTLAKMRVHISSCLKVQEQMANCPKFVPVVPTS 
QPIPSNIPNRSTFACPYCGARNLDQQELVKHCVE 
SHRSDPNRVVCPICSAMPWGDPSYKSANFLQHL 
LHRHKFSYDTFVDYSIDEEAAFQAALALSLSEN 

3676 

A 

3 

1856 

TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRR 1 

RRMISRYTRXAVPQSLELKGITKHALNHHPPPEK 

LEEISPTSDSHEKDTSSQSKSDITRESSFTSADTGN 

SLSAFPSYTGAGISTEGSSDFSWGYGELDQNATE 

KVQTMFTAEDELLYEQKLSVHTKSLQEECQQWT 

ASFPHLR1LGRQIITPSEGYRLYPRSPSAVSASYET 

TLSQERDSTIFGIRGKKLHFSSSYAHKASSIAKSSS 

FCSMERDEEDSIIVSEGIIEEYLAFDHIDIEEGFHG 

KKSEAATEKQKiGYPPIAPFYCMKEDVLAYVFD 

SVWCKVVSCMEQLTRSHWEGFASDDESNVAVT 

RPDSESSCVLSELHPLVLPRVPQSKVLYITSNPMS 

LCQASRHQPNVhTOLLVHGMPLQPl^SLMDKLL 

DLDDKLLMRPGSSTILSTRNWPNRAVEFSTSSLS 

YTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEE1L 
RGARVPVAPDSLSSPSPTPLSRNNLLPPrGTAEVE 

HVSTVGPQRQMKPHGDSSRAQSAVVDEPNYQQ 

PQERLLLPDFFPRPNTTQSFLLDTQYRRSCAVEYP 

HQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG 

P 

3677 

A 

/V 


7<%7 

JV/JPI C\n a tetvt t pin /^dtt \/\i/T rTDnuA/fcr i u/r , cr' 
iviKJL ^ O/LLr VLJLrol-«VjrlL» V WLr IKlJriMoCxWCfcvj 

PRMLSWCPFYKVLLLVQTAIYSVVGYASYLVWK 

DLGGGLGWPLALPLGLYAVQLTISWTVLVLFFT 

VHNPGLALLHLLLLYGLVVSTALIWHPINKLAAL 

LLLPYLAWLTVTSALTYHLWRDSLCPVHQPQPT 

EKSD 

3678 

A 

20 

1508 

RGKAEFFLAMAGTNALLMLENFIDGKFLPCSSY1 

DSYDPSTGEVYCRVPNSGKDEIEAAVKAAREAFP 

SWSSRSPQERSRVLNQVADLLEQSLEEFAQAESK 

DQGKTLALARTMDIPRSVQNFRFFASSSLHHTSE 

CTQMDHLGCMHYTVRAPVGVAGLISPWNLPLY 

LLTWKIAPAMAAGNTVIAKPSELTSVTAWMLCK 

LLDKAGVPPGVVNIVFGTGPRVGEALVSHPEVPL 

ISFTGSQPTAERITQLSAPHCKKLSLELGGKNPAII 

FEDANLDECIPATVRSSFANQGEICLCTSRIFVQK 

SlYbbFLKR>VEATRKWKVGIPSDPLVSIGALISK 

AHLEKVRSYVKRALAEGAQIWCGEGVDKLSLPA 

RNQAGYFMLPTVITDIKDESCCMTEEIFGPVTCV 

VPFDSEEEV1ERANNVKYGLAATVWSSNVGRVH 

RVAKKLQSGLVWTNCWLIRELNLPFGGMKSSGI 

GREGAKDSYDFFTEIKTITVKH 

3679 

A 

1862 

502 

MAGTKPYMEIQTTIREYYEHLYANKLENLEEMD 

KFLDTYTLPRLNQEEVESLNRPITGSEIEAIINSLP 

TKKIPGPDRFTAKFYQRYKEELSNLIHYLGLSHH 

LLALNFIIVSFGKKSAWSSAQVKVTDTDFDGVEV 

RVFEGPPKPEEPLKRSVVYIHGGGWALASAKIRY 

YDELCTAMAEELNAV1VSIEYRLVPKVYFPEQIH 

DVVRATKYFLKPEVLQKYMVDPGRICISGDSAG 

GNLAAALGQQFTQDASLKNKLKLQALIYPVLQA 

LUrN 1 ri> YC^yN VN 1 rULJrK Y VM VK Y W VDYrKG 

NYDFVQAMIVNNHTSLDVEEAAAVRARLNWTS 

LLPASFTKNYKPVVQTTGNARIVQELPQLLDARS 

APLIADQAVLQLLPKTYILTCEHDVLRDDGIMYA 

KRLESAGVEVTLDHFEDGFHGCMIFTSWPTNFSV 

GIRTRNSYIKWLDQNL 

3680 

A 

249 

2146 

RSWGAPWFWRMRLLRRRHMPLRLAMVGCAFV 

LFLFLLHRDVSSREEATEKPWLKSLVSRKDHVLD 

LMLEAMNNLRDSMPKLQIRAPEAQQTLFSINQSC 

LPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 

SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSL 

GPDTRPPECVDQKFRRCPPLATTSVIIVFHNEAWS 

TLLRTVYSVLHTTPAILLKEIILVDDASTEEHLKE 

KLEQYVKQLQVVRVVRQEERKGLITARLLGASV 

AQAEVLTFLDAHCECFHGAVLEPLLARIAEDKTV 

LTFGWETLPPHEKQRRKDETYPIKSPTFAGGLFSI 

SKSYFEHIGTYDNQMEIWGGENVEMSFRVWQC 

GGQLEIIPCSVVGHVFRTKSPHTFPKGTSVIARNQ 

VRLAEVWMDSYKKIFYRRNLQAAKMAQEKSFG 

DISERLQLREQLHCHNFSWYLHNVYPEMFVPDL 
TPTFYGAIKNLGTNQCLDVGENNRGGKPLIMYS 
CHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKG 
ALGLGSCHFTGKNSQVPKDEEWELAQDQLIRNS 
GSGTCLTSQDKKPAMAPCNPSDPHQLWLFV 

3681 

A 

2982 

1869 

LKDTLKSQMTQEASDEAEDMKEAMNRMIDELN 

KQVSELSQLYKEAQAELEDYRKRKSLEDVTAEY 

IHKAEHEKLMQLTNVSRAKAEDALSEMKSQYSK 

VLNELTQLKQLVDAQKENSVSITEHLQVITTLRT 

AAKEMEEKISNLKEHLASKEVEVAKLEKQLLEE 

KAAMTDAMVPRSSYEKLQSSLESEVSVLASKLK 

ESVKEKEKVHSEVVQIRSEVSQVKI^ 

KSKEQEVNELLQKFQQAQEELAEMKRYSESSSK 

LEEDKDKKINEMSKEVTKLKEALNSLSQLSYSTS 

SSKRQSQQLEALQQQVKQLQNQLAECKKQHQE 

VISVYRMHLLYAVQGQMDEDVQKVLKQILTMC 

KNQSQKK 

3682 

A 

447 

1024 

AQALTAGRQLALAAPFIAPISP1SLPRLNPPSQSW 

NSTPFFKVKLPPQKEV1TSDELMAHLGNCLLSIKP 

QEKSEGLQLNFQQNVDDAMTVLPKLATGLDVN 

VRFTGVSDFEYTPECSVFDLLGIPLYHGWLVDPQ 

QSPEAVRAVGKLSYNQL/VGEDHHLQTLQ*HQP 

RDRKPDCRAVPGDHRGPSDLPRTV 

3683 

A 

2 

942 

LEIKQEEKFVGQCIKEELMHGECVKEEKDFLKKE 

IVDDTKVKEEPP1NHPVGCKRKLAMSRCETCGTE 

EAKYRCPRCMRYSCSLPCVKKHKAELTCNGVRD 

KTAYISIQQFTEMNLLSDYRFLEDVARTADHISR 

DAFLKRP1SNKYMYFMKNRARRQG1NLKLLPNG 

FTKRKENSTFFDKKKQQFCWHVKLQFPQSQA\ST 

*KKRVPDDKTINEILKPYIDPEKSDPV1RQRLKAYI 

RSQTGVQILMKIEYMQQNLVRYYELDPYKSLLD 

NLRNKVIIEYPTLHVVLKGSNNDMKVLHQVKSE 

STKNVGNEN 

3684 

A 

119 

1533 

SLQENVQEKRVRVCPGLGGLLPNGTPSITAAAAP 

QVLWRHVQPGCSHHLHACVIRAACRAGEGHAD 

RHAGPPET/PVTLPSSWPWSSPWERQCPMH\L*AP 

GHAFRPVPTEHRRGWAALGHHRAAAGPLREPAS 

GSQPAPASC*PECHHGCPEQTRQCQDLLREAVV 

APEQRG*PCAHLQT*ATATTLCPQVPAGRVWQP 

GHSCHLLPHRHDGSH*HHCAAHRRPVTRRQAAH 

GVPLPDACYSPHHTLPAAPPPATRPAGHTATHPE 

*GGDLTPVPDGPHDCPRDVQGIPGAGGGSQLAPC 

CPPFPAAPVSVQGTQGLGPKNVLH*QWEGIRWQ 

KEPE/PGPPPEVELKRGAKCR1GDHGLGAVLGQG 

EYAS*SPSIPW*ASSSACPPLHPTP/TVYTQSPAAA 

PGWTRPPSP/PPPGLYPGP/PASHAPGVRGGISHQL 

Y^Lr^JLCKl^CCbCr/rrPrAHGORCPbLLPPbALAK 

LLL 

3685 

A 

101 

438 

AWVLQCKINTELQTEVVMLKSMVLWLGEQVQS 
LQLQQQLHCHFNHTHICVT^EYNXKEYPWDLV 

PSL*AWTEFQQGLE 

3686 

A 

105 

845 

VSDVVKNQLVEVQCRQDGCDAVENVHQMFMF 
NWFTDCLWTLFLSNYQPSVESSSPGGSATSDDHE 
FDPSADMLVHDFDDERTLEEEEMMEGETNFSSEI 
EDL AREGDMPIHELLSL YG YG STVRLPEEDEEEE 
EEEEEGEDDEDADNDDNSGCSGENKEENIKDSS 
GQEDETQSSNDDPSQSVASQDAQEIIRPRRCKYF 
DTNSEVEEESEEDEDYIP/SIISFFQSSDGI*SSSSSE 
DWKKEIMVGS 

3687 

A 

49 

1225 

PVLVTSLRMREADTLRPPQLMEVSADIISTVEFN 

HTGELLATGDKGGRVVIFQREPESKNAPHSQGE 

YDVYSTFQSHEPEFDYLKSLEIEEK1NKIKWLPQQ 

NAAHSLLSTNDKTIKLWK1TERDKRPEGYNLKDE 

EGKLKDLSTVTSLQVPVLKPMDLMVEVSPRRIFA 

NGHTYHINSISVNSDCETYMSADDLRINLWHLAI 

TDRSFTP\NIVDIKPANMEDLTEVITASEFHPHHC 

NLFVYSSSKGSLRLCDMRAAALCDKHSKLFEEPE 

DPSNRSFFSEUS\SVSDVKFSHSDRYMLTR\DYLT 

VKVWDLNMEARPIETYQVHDYLRSKLCSLYEND 

CIFDKJFECAWNGSDR/IIMTGAYr^ 

KRDVTLEASRGSSKPRAVL 

3688

A 

1 

401 

KKVPGRLSEMSFSLNFTLPANTTSSPVTaDCGPSL 
ULAAUlr LL V A 1 AL,L V ALLr 1 LlilKKKooIcAMbJb 
olJK^v^blojDlJJL/lNrlvlobr^ 1 rlcKiN 1 JVKjA^ii 
AHIYVKTVAGSEEPVHDRYRPTIEMERRR 

3689 

A 

698 

889 

GRVLVHCAMGVSRSATLVLAFLMIYENMTLVEA 
IPDGAGPPQISALTQAFVRQLQVLDNRLGRE 

3690 

A 

61 

153 

MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 

3691 

A 

61 

153 

MGAHLVRRYLGDASVEPDPLQMPTFPPDYGF 

3692 

A 

3 

2831 

PLVRRLLRQTLRRVGGARAVREAVMRAVLTWR 

DKAEHCINDIAFKPDGTQLILAAGSRLLVYDTSD 

GTLLQPLKGHKDTVYCVAYAKDGKRFASGSAD 

KSVIIWTSKLEG1LKYTHNDAIQCVSYNPITHQLA 

SCSSSDFGLWSPEQKSVSKHKSSSKIICCSWTNDG 

QYLALGMFNGIISIRNKNGEEKVKIERPGGSLSPI 

WSICWNPSSRWESFWMNRENEDAEDV1VNRYIQ 

EIPSTLKSAVYSSQGSEAEEEEPEEEDDSPRDDNL 

EERNDILAVADWGVQKVSFYQLSGKQIGKDRAL 

NFDPCCISYFTKGEYILLGGSDKQVSLFTKDGVR 

LGTVGEQNSWVWTGQAKPDSNYVVGGCQDGTI 

SFYQLIFSTVHGLYKDRYAYRDSMTDVIVQHLIT 

EQKVTUKCKELVKKIAIYRNRLAIQLPEKILIYELY 

SEDLSDMHYRVKEKIIICKFECNLLVVCANHIILC 

QEKRLQCLSFSGVKEREWQMESLIRYIKVIGGPP 

GREGLLVGLKNGQILKIFVDNLFAIVLLKQATAV 

RCLDMSASRKKLAVVDENDTCLVYDIDTKELLF 

QEPNANSVAWNTQCEDMLCFSGGGYLNIKASTF 

PVHRQKLQGFVVGYNGSKIFCLHVFSISAVEVPQ 

SAPMYQYLDRKLFKEAYQIACLGVTDTDWRELA 

MEALEGLDFETAKKERKKRGETNNDLFLADVFS 

YQGKPHEAAKLYKRSGHENLALEMYTDLCMFE 

YAKDFLGSGDPKETKMLITKQADWARN1KEPKA 

AVEMYISAGEHVKAIEICGDHGWVDMLIDIARK

LDKAEREPLLLCATYLKKLDSPGYAAETYLKMG 

tw VQ1 \/r\T tn/CTAT? n/nc afai f^uv uvTmiST^rwv 
IJjLJvoL, vl^Lri V n. 1 WiJiiAr ALUriJvrir Hr JvULJl I 

MPYAQWLAENDRFEEAQKAFHKAGRQREAVQV 

LEQLTNNAVAESRFNDAAYYYWMLSMQCLDl^ 

QDPAQKD 

3693 

A 

3 

1099 

SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTIHG 
GWRHHRDHTAIDEWDFNPSKFLIYTCLLLFSVLL 
PLRLDGIIQWSYWAVFAPIWLWKLLVVAGASVG 

AGVWARNPRYRTEGEACVEFKAMLlAVGrHLLL 

LMFEVLVCDRVERGTHFWLLVFMPLFFVSPVSV 

AACVWGFRHDRSLELEILCSVNILQFIFIALKLDR1 

IHWPWLVVFVPLWILMSFLCLVVLYYIVWSLLFL 

RSLDVVAEQRRTHVTMAISWITIVVPLLTFEVLL 

VHRLDGHNTFSYVSIFVPLWLSLLTLMAT1FRRX 

GGNHWWFAIRRDF/CQDQLPQPTGKPPPPPLTDH 

HGEKALPLQNKDRGSWPASRGSPRLL 

3694 

A 

A 

483 

761 

PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIHE 
LVEMKSVLKSDEEHMEEAITSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 

3695 

A 

483 

761 

PRSLIDYKSYMDTKLLVARFLEQSSCTMTPDIFEE 
LVENIKSVLKSDEEHMEEA1TSASFLEQIMAHSX 
QHIRAHKLPXETAGLXTSELRXLTP 

3696 

A 

456 

733 

LSAALWEEPILSLWSETKELTNRGKMNYPQIGPH 
RPHVKGLRVRPGPGTLSNAPKSLCPGMSNSDRGI 
H\GGEGQGPGKRAGHLGRGGGMSFL 

3697 

A 

877 

1873 

VWL*TLS*HTCALMTVCRSCLVKYLEENNTCPT
CRMHQSriPLQYIGHDRTMQDIVYKLVPGLQEA 
EMRKQREFYHKLGMEVPGDIKGETCSAKQHLDS 
HRNGETKADDSSNKEAAE 

3698 

A 

1 

572 

KQCGIPHEVVRDENSSVYAEVSRLLLATGHWKR 

LRJRJDNPRFNLMLGERNRLPFGRLGHEPGLVQLV 

NYYRGADKLCRKASLVKLIKTSPELAESCTWFPE 

SYVIYPTNLKTPVAPAQNGIQPPISNSRTDEREFFL 

ASYNRKKEDGEGNVW1AKSSAGAKVWVQW*M 

TDLEEEIDIPSPVGLGLESEWPL 

3699 

A 

2008 

2432 

LHCKMGALETQTHPCSQNMLRSLQKCCCKVEE 

HHLQPVQVLQTLLHSATAGTGCRRPARPPPAPPT 

PTPWRSRQSGKQSERAS*LKGRGRYGLGALGGR 

GGRALGGSRWPPPLPGETLFSGCKHRRRRRGSD 

AAPGEEAGT 

3700 

A 

33 

1318 

GYQIGMALASGPARRALAGSGQLGLGGFGAPRR 

GAYEWGVRSTRKSEPPPLDRVYEIPGLEPITFAG 

KMHFVPWLARPIFPPWDRGYKDPRFYRSPPLHE 

HPLYKDQACYIFHHRCRLLEGVKQALWLTKTKL 

IEGLPEKVLSLVDDPRNHIENQDECVLNVISHARL 

WQTTEEIPKRETYCPVIVDNLIQLCKSQILKHPSL 

ARR1CVQNSTFSATWNRESLLLQVRGSGGARLST 

KDPLPTIASREEIEATKNHVLETFYPISPIIDLHECN 

IYDVKNDTGFQEGYPYPYPHTLYLLDKANLRPH 

RLQPDQLRAKM1LFAFGSALAQARLLYGNDAKV 

LEQPVVVQSVGTDGRVFHFLVFQLNTTDLDSNE 

GVKi^AWVDSDQLLYQHFWCLPVIKKRVV 

VGPVGFKPETFRKFLALYLHGAA 

3701 

A 

86 

465 

WTLCGPEAGMVGYDPKPDGRNNTKFQVAVAGS 
VSGLVTRALISPFDVIKIRFQLQHERLSRSDPSAK 
YHGILQASRQILQEEGPTAFWKGHVPAQILSIGY 
CjAVv^r Lor bMLILLVrlKUoV YDAKb 

3702 

A 

166 

814 

GFWEKTNQSSHSMDPLGAPSQFVDVDTLPSWGD 
SCQDELNSSDTTAEIFQEDTVRSPFLYNKDVNGK 
WLWKGDVALLNCTAIVNTSNESLTDKNPVSESI 
FMLAGPDLKEDLQKLKGCRTGEAQLTKGFNLAA 
RFIIHTVGPKYKSRYRTAAESSLYSCYRNVLQLA 
KEQSMSSVGFCVINSAKRGYPLKDATHIALRTVR 
RPLEIHGETIEKVV 

3703 

A 

128 

1255 

SLGPSPKSATIPCCGDTMAPEEDAGGEALGGSFW 

EAGNYRRTVQRVEDGHRLCGDLVSCFQERARIE 

KAYAQQLADWARKWRGTVEKGPQYGTLEKAW 

HAFFTAAERLSALHLEVREKLQGQDSERVRAWQ 

RGAFHRPVLGGFRESRAAEDGFRKAQKPWLKRL 

KEVEASKKSYHAARKDEKTAQTRESHAKADSA

VSQEQLRKLQERVERCAKEAEKTKAQYEQTLAE 

LHRYTPRYMEDMEQAFETCQAAERQRLLFFKD 

MLLTLHQHLDLSSSEKFHELHRDLHQGIEAASDE 

EDLRWWRSTHGPGMAMNWPQFEEWSLDTQRTI 

SRKEKGGRSPDEVTLTSIVPTRDGTAPPPQSPGSP 

GTGQDEEWSDEESP 

3704 

A 

1 

271 

ARGEDLALATGGGPDTVTHSNMPCPNSLVYDC 

WLNIKECSVGEHTFEDLGLCPGRNQREKKRSYK 

DFLREEEKIAAQVRNSSKKKLKDSE 

3705 

A 

170 

1318 

LNWANLVIMWPREEEKEKVQDYSLGGLSPDLRI 

DVSRKKKILKAYDEDEDEDLYPDIHPPPSLPLPG 

QFTCPQCRKSFTRRSFRPNLQLANMVQIIRQMCP 

TPYRGNRSNDQGMCFKHQEALKLFCEVDKEAIC 

VVCRESRSHKQHSVLPLEEVVQEYKAKLQGHVE 

PLRKHLEAVQKMKAKEERRVTELKSQMKSELA 

AVASEFGRLTRFLAEEQAGLERRLREMHEAQLG 

RAGAAASRLAEQAAQLSRLLAEAQERSQQGGLR 

LLQDIKETFNRCEEVQLQPPEVWSPDPCQPHSHD 

FLTDAIVRKMSRMFCQAARVDLTLDPDTAHPAL 

MLSPDRRGVRLAERRQEVADHPKRFSADCCVLG 

AQGFRSGRHYWEVCMGP 

3706 

A 

204 

1996 

SRERQTTWMDHNFAPAPPEMQSHGAPGPGTSFS 

HSHVLGRPIRPSRLPGGGSPLTPVLRKTIHLDTFP 

QSHIPQTSSRLGLGARTRSVPPQETGIALGASLSP 

LPTSSLVPRKLSSISLTLHQNSQARSLDRPLSHWE 

ELPTPGKKAAPHEGGRVSSPGSPPVTLVPGGRVH 

SEGPGNPGLTKSNRMLATEKPLVSSYLALPFQSR 

LAQSAPVLAEPGSLGQGHLVSVTDHMPTRASPG 

KGKPRARGIPRPRGRLQRANTTVNLTAMDTRTD 

AARHLATMATNRPSLAINLATPNTSQLDTGTEFP 

ALD1KLGTARDLSSVGTVKSGKTVNLATAGTIKP 

GTAMNLTTVGTTKPGMVMDLIASEPDKLGKAM 

ATRSTAKPDMTTEG1AMDSATSDPVKPDTITATV 

GTSRLETAMALARVNRAKLGTAKNSLALDTSR 

MGTAVGSVVPVTPDPATGKTTLGSVTSFNLTISDV 

ATCLLMPSRSTDLALDNTNAAMDRATEPASLDL 

ATEYKGKCRNLVGDGLGCREGEVCELGDGSMK 

PMSINSNLLGYIGIDTimQMRKKTMKTGFDFNIM 

WGTEGCGAAAGLVAGSTKDPISFPQ 

3707 

A 

3 

549 

SSSISRDFLGQAACASGTMLRWLRDFVLPTAACQ 
DAEQPMRYETLFQALDRNGDGVVDIGELQEGLR 

LKIDrffiKiCMKLAFKSLD 

TLGLTISEQQAELILQSIDVDGTMTVDWNEWRD 
YFLFNPVTDIEEIIR 

3708 

A 

1 

1866 

EFRGAGRANNI1,APRGAAVLLLEDLVLQRWLAAG 
AQATPQVFDLLPSSSQRLNPGALLPVLTDPALND 
LYVISTFKLQTKSSAT1FGLYSSTDNSKYFEFTVM 

GRLSKAILRYLKNDGKVHLVVFNNLQLADGRRH 

RILLRLSNLQRGAGSLELYLDC1QVDSVHNLPRA 

FAGPSQKPETIELRTFQRKPQDFLEELKLVVRGSL 

FQVASLQDCFLQQSEPLAATGTGDFNRQFLGQM 

TQLNQLLGEVKDLLRQEVNETSFLRNTITECQAC 

GPLKFQSPTPSTVVPPASPAPPTRPPRRCDSNPCF 

RGVQCTDSRDGFQCGPCPEGYTGNGITCIDVDEC 

KYHPCYPGEHCTNLSPGFRCDACPVGFTGPMVQ 

GVGISFAKSNKQVCTDIDECRNGACVPNSICVNT 

LGSYRCGPCKPGYTGDQIRGCKAERNCRNPELN 

PCSVNAQCIEERQGDVTCVCGVGWAGDGYICGK 

DVDIDSYPDEELPCSARNCKKDNCKYVPNSGQE 

DADRDGIGDACDEDADGDGILNEQDNCVLIHNV 

DQRNSDKDIFGDACDNCLSVLNNDQKDTDGDG 

RGDACDDDMDGDGIKNILDNCPKFPNRDQRDK 

DGDGVGDACDSCPDVSNPNQ 

3709 

A 

144 

417 

TQAMEGLLHYINPAHAISLLSALNEERLKGQLCD 
VLLIVGDQKFRAHKNVLAASSEYFQSLFTNKENE 
SQTVFQLDFCEPDAFDNVLNYIY 

3710 

A 

245 

688 

FGMLKNKGHSSKKDNLAVNAVALQDHILHDLQ 

LRNLSVADHSKTQVQKKENICSLKRDTKAIIDTGL 

KKTTQCPKLEDSEKEYVLDPKPPPLTLAQKLGLI 

GPPPPPLSSDEWEKVKQRSLLQGDSVQPCP1CKE 

EFELRPQVFSIRG 

3711 

A 

3 

773 

SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRS 
TPAMMNGQGSTTSSSKNIAYNCCWT)QCQACFNS 
SPDLADHIRSIHVDGQRGGVFVCLWKGCKVYNT 
PSTSQSWLQRHMLTHSGDKPFKCVVGGCNASFA
SQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSK 
AGMNKREXLKNKRRRSLARPHDFFDAQTLDAIR 
HRAICFM.SAHIESLGKGHSVVFHSTVSILLFFQIK 
YKTLQKNISTIISKSLKI 

3712 

A 

2 

344 

RATWHNAGKEREAVQLMAGAEKRVKASHSFLR 
GLFGGNTRIEEACEMYTRAANMFKMAKNWSAA 
GNAFCQAAKLHMQLQSKHDSATSFVDAGNAYK 
KADPQGKTARHVACYLCV 

3713 

A 

20 

974 

GAAATACSSSSSSSGAPATWAAHGPGKDVASPS 

SVSLSPRRSRLLVLRCGLRRNPERPSSSPALRRLL

LLLLLLLLLLLGFLLSPGPERGVGGGRFGRRLAL 

LWAAALGHVVSGKVMSRRAPGSRLSSGGGGGG 

TNYSRSWNDWQPRTDSASADPGNLKYSSSRDRG 

GSSSYGLQPSNSAWSRQRHDDTRVHADIQNDE 

KGGYSVNGGSGENTYGRKSLGQELRVNNVTSPE 

l^SVQHGSRALATKDMRKSQERSMSYCDESRLS 

YLLRRITRENDRDRRLATVKQLKEFIQQPENKLV 

LVKQLDILAAVHDVLNER 

3714 

A 

237 

458 

IFALKSPSYLLPCCTPEGKMDHKQLCWSHPQKSG 
QSSRSCCICSNQHGLIWKYSLNMCLQCCHQYVK 

3715 

A 

970 

1524 

LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGFYH 

EAVVLFTQALKLNPQDHRLFGNRSFCHERLGQP 

AWALADAQVALTLRPGWPRGLFRLGKALMGLQ 

RFREAAAVFQETLRGGSQPDAARELRSCLLHLTL 

QGQRGGICAPPLSPGALQPLPHAELAPSGLPSLRC 

PRSTALRSPGLSPLLH 

3716 

A 

85 

308 

QGLPSTMVKLGCSFSGKPGKDPGDQDGAAMDS 

VPLISPLDISQLQPPLPDQVV1KTQTEYQLSSPDQQ 

NYTKSR 

3717

A 

A 


618 

GAGCTSPGLWARKAAARCLPTYPSRAQPSNVGR 

RRRJRRPGLGALAAGVPAMAESVERLQQRVQELE 

RELAQERSLQVPRSGDGGGGRVRIEKMSSEVVD 

SNPYSRLMALKRMGIVSDYEKIRTFAVAIVGVGG

VGSVTAEMLTRCGIGKLLLFDYDKVELANMNRL 

FFQPHQAGLSKVQAAGHTPEE 

3718

A 

A 

3 

593 

RGAGGRAGGRADGQPNMADQRQRSLSTSGESL 

YHVLGLDKNATSDDIKKSYRKLALKYHPDKNPD 

NPEAADKFKEINNAHAILTDATKRNIYDKYGSLG 

LYVAEQFGEENVNTWVLSSWWAKALFVFCGLL 

TCCYCCCCLCCCFNCCCGKCKPKAPEGEETEFY 

VSPEDLEAQLQSDEREATDTPIVIQPASATEP 

3719 

A 

2 

2173 

SGGVRMGSRADGPRTSGHVTGKMAVFPWHSRN 

RNYKAEFASCRLEAVPLEFGDYHPLKPITVTESK 

TKKVNRKGSTSSTSSSSSSSVVDPLSSVLDGTDPL 

SMFAATADPAALAAAMDSSRRKRDRDDNSVVG 

SDFEPWTNKRGEILARYTTTEKLSFNLFMGSEKG 

KAGTATLAMSEKVRTRLEELDDFEEGSQKELLN 

LTQQDYVNRIEELNQSLKDAWASDQKVKAPKN 

VHPGKLVYERIFSMCVDSRSVLPDHFSPENANDT 

AKETCLNWFFKIASIRELIPRFYVEASILKCNKFLS 

KTGISECLPRLTCMIRGIGDPL\GSVYARAYL\SRV 

GMEVAPHLKETLNKNFFDFLLTFKQIHGDTVQN 

QLVVQGVELPSYLPLYPPAMDWIFQCISYHAPEA 

LLTEMMERCKKLGNNALLLNSVMSAFRAEFIAT 

RSMDFIGMIKECDESGFPKHLLFRSLGLNLALAD 

PPESDRLQILNEAWKVITKLKNPQDYINCAEVWV 

EYTCKHFTKREVNTVLADV1KHMTPDRAFEDSY 

PQLQLIIKKVIAHFHDFSVLFSVEKFLPFLDMFQK 

ESVRVEVCKCI\RTPLSSINKSPPRTRSS*MPFCMF 

ARPCMTL/CNALTLEDEKRMLSYLINGFIKMVSF 

GRDFEQQLSFYVESRSMFCNLEPVLVQLIHSVNR 

LAMETRKVMKGNHSRKTAAFVRSWGAYWFITIP 

SLAGIFTRLNLYLHSG 

3720 

A 

24 

296 

ENLFRAGFAFSLLRSSFY1SKTYCSWFSNLISGSL 

ADFwSKGTRDYSPRQNlAVRE/KVFDVIIRCFKJRH 

GAEVIDTPVFELKVRNGQEETTW 

3721 

A 

2 

310 

PSCLTCVGHCSIGGSCTMIGIMMPECHCSLHMTG 
PRCEEHVFILQQPGHIASILIPLLVLLLLALVAGVV 
FWHKRRVQGAKGFQHQRMTNGAMNVEIGNPTY 
K 

3722 

A 

75 

722 

MELVAGCYEQVLFGFAVHPEPEACGDHEQWTL 
VADFTHHAHTASLSAVAVNSRFVVTGSKDETIHI 
YDMKKKIEHGALVHHSGTITCLKFYGNRHLISGA 
EDGLICIWDAKKWECLKSIKAHKGQVTFLSIHPS 

CWl AT Q\mTV\VTT PTWMT Q A CTV\ITl''n\I A 

UJVL/VJLfO VU lUlViJLxvi Wr^ljVtiLjKoArlKlNlK^lNA 

HTVEWSPRGEQYVVIIQNKIDIYQLDTASISGTITN 
EKRISSVKFLSES 

3723 

A 

110 

316 

MELSDNRRSGGLEGLAEKCPNLTYLNLSGNKJK 
DLSTVEALVSGTVLSLDLLFLVKFSEICLCLLIS1 

3724 

A 

3 

406 

VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 
VRACASLGVLSFPELEWYEESRMVSLTAPYVSG 
FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 
GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 
DO 

3725 

A 

3 

406 

VDRGTEAWQRDPAFSGLQRVGGVDVSFVKGDS 

VRACASLGVLSFPELEVVYEESRMVSLTAJPYVSG 

FLAFREVPFLLELVQQLREKEPGLMPQVLLVDGN 

GVLHHRGFGVACHLGVLTDLPCVGVAKKLLQV 

DG 

3726 

A 

1 

433 

SSDDRSLFRRLKLNYAIFDEGHMLKNMGSIRYQ 
HLMTINANNRLLLTGTPVQNNLLELMSLLNFVM 
PHMFSSSTSEIRRMFSSKTKSADEQSIYEKERIAH 
AKQIIKPFILRRVKEEVLKQLPPKKDRIELCAMSE 
KQEQLYLG 

3727 

A 

6 

383 

R1PRGKACXTVLGRSTGELEGFASSRLPPQPCGW 
GQSSDLLSRIDLDELMKKDEPPLDFPDTLEGFEY 
AF^EKGQLRHTKTGEPFVFNYREHLHRWNQKRY 
EALGEIITKYVYELLEKDCNSKKVS 

3728 

A 

3 

2452 

E1AGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSVVSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDE1THDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

PXGNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKJ3SDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AWQDSAFSYRDAKKKLRLALCSADSVAFPVL-n 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVL1KANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 

3729 

A 

3 

2452 

EIAGAAAENMLGSLLCLPGSGSVLLDPCTGST1SE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSWSGISATSEDIP 

^EDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

rTlLfcGA VGCjNbARLrNf ubrMr *LPAEMbAr KQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAEKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\ 

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLN4AQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKKIREFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQEPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 

3730 

A 

3 

2452 

EIAGAAAENMLGSLLCLPGSGSVLLDPCTGSTISE 

TTSEAWSVEVLPSDSEAPDLKQEERLQELESCSG 

LGSTSDDTDVREVSSRPSTPGLSVVSGISATSEDIP 

NKIEDLRSECSSDFGGKDSVTSPDMDEITHDFLYI 

LQPKQHFQHIEAEADMRIQLSSSAHQLTSPPSQSE 

SLLAMFDPLSSHEGASAVVRPKVHYARPSHPPPD 

PPILEGAVGGNEARLPNFGSPMF*LPAEMEAFKQ 

RHS/YTPERLVRSRSS\DIVSSVRRPMSDPSWNRR 

P\GNEERELPPAAAIGATSLVAAPHSSSSSPSKDSS 

RGETEERKDSDDEKSDRNRPWWRKRFVSAMPK 

APIPFRKKEKQEKDKDDLGPDRFSTLTDDPSPRLS 

AQAQVAEDILDKYRNAIKRTSPSDGAMANYEST 

EVMGDGESAHDSPRDEALQNISADDLPDSASQA 

AHPQDSAFSYRDAKKKLRLALCSADSVAFPVLT\

HSTRNGLPDHTDPEDNEIVCFLKVQIAEAINLQD 

KNLMAQLQETMRCVCRFDNRTCRKLLASIAEDY 

RKRAPYIAYLTRCRQGLQTTQAHLERLLQRVLR 

DKEVANRYFTTVCVRLLLESKEKK1REFIQDFQK 

LTAADDKTAQVEDFLQFLYGAMAQDVIWQNAS 

EEQLQDAQLAIERSVMNRIFKLAFYPNQDGDILR 

DQVLHEHIQRLSKVVTANHRALQIPEVYLREAP 

WPSAQSEIRTISAYKTPRDKVQCILRMCSTIMNLL 

SLANEDSVPGADDFVPVLVFVLIKANPPCLLSTV 

QYISSFYASCLSGEESYWWMQFTAAVEFIKTIDD 

RK 

3731 

A 

1 

1305 

WTAMHEAKLMEECDELVEIIQQRKQMIAVKIK 

ETKVMKLRKLAQQVANCRQCLERSTVLINQAEH 

ILKENDQARFLQSAKNIAERVAMATASSQVLIPDI 

NFNDAFENFALDFSREKKLLEGLDYLTAPNPPSIR 

EELCTASHDTITVHWISDDEFSISSYELQYTIFTGQ 

ANFISLYNSVDSWMIVPNIKQNHYTVHGLQSGTR 

YIFIVKAINQAGSRNSEPTRLKTNSQPFKLDPKMT 

HKKLKISNDGLQMEKDESSLKKSHTPERFSGTGC 

YVYGVLHNSDNS*MFISLSFPLSHRYAIGIAYKSA 

PKNEWIGKNASSWVFSRCNSNFVVRHNNKEML 

VDVPPHLKRLGVLLDYDNY/NMLSFYDPANSLXH 

LHTFDVTFVILPVCPTFTIWNKSLMILSGLPAPDFI 

DYPERQECNCRPQESPYVSGMKTCH 

3732 

A 

127 

2832 

LGQRLSLVPRPSLKRRLGKRLSLGLRERMMSLW 
WS/GPKVRTQATTGARPKTETKSVPAARPKTEAQ 
AMSGARPKTEVQVMGGARPKTEAQGITGARPKT 
DARAVGGARSKTDAKAIPGARPKDEAQAWAQS 

EFGTEAVSQAEGVSQTNAVAWPLATAESGSVTK 

SK\ACLWIEN*SMWM/PETFPGTQGQKGIQPWFG 

PGEETNMGSWCYSRPRAREEASNESGFWSADET 

STASSFWTGEETSVRSWPREESNTRSRHRAKHQT 

NPRSRPRSKQEAYVDSWSGSEDEASNPFSFWVG 

ENTNNLFRPRVREEANIRSKLRTNREDCFESESED 

EFYKQSWVLPGEEANXIDSGTETKKILILPWKLRA 

QKDVDSDRVKQEPRFEEEVIIGSWFWAEKEASLE 

GGASAICESEPGTEEGAIGGSAYWAEEKSSLGAV 

AREEAKPESEEEA1FGSWFWDRDEACFDLNPCPV 

YKVSDRFRDAAEELNASSRPQTWDEVTVEFKPG 

LFHGVGFRSTSPFGIPEEASEMLEAKPKNLELSPE 

GEEQESLLQPDQPSPEFTFQYDPSYRSVREIREHL 

RARESAESESWSCSCIQCELKIGSEEFEEFLLLMD 

KJRDPFIHEISKJAMGMRSASQFTRDFIRDSGVVS 

LIETLLNYPSSRVRTSFLENMIHMAPPYPNLNMIE 

TFICQVCEETLAHSVDSLEQLTGNKGCFRHLTMT 

IDYHTVLIAN*YGPGFPLLF*PQAQCGETKFHVLK

MLLNLSENPAVAKKLFSAKALSIFVGLFNIEETN 

DNIQIVIKMFQNISNI1KSGKMSLIDDDFSLEPLISA 

FREFEELAKQLQAQIDNQNDPEATGTTAFVGKG 

NNPSANRERLSPSVFCPGAQEAESLPARRVRGEE 

QRLLLEEVGARTADGIPEGW 

3733 

A 

2 

3274 

DVPLIRIEEDTGEIFTTGARIDREKLCAGIPRDEHC 

FYEVEVAILPDEIFRLVKIRFLIEDINDNAPLFPAT 

VINISIPENSAINSKYTLPAAVDPDVGINGVQNYE 

LIKSQNIFGLDVIETPGGDKMPQLIVQKELDREEK 

DTYVMKVKVEDGGFPQRSSTAILQVSVTDTNDN 

HPVFKETEIEVSIPENAPVGTSVTQLHATDADIGE 

NAKIHFSFSNLVSNIARRLFHLNATTGLITIKEPLD

REETPNHKLLVLASDGGLMPARAMVLVNVTDV 

NDNVPSID1RYIVNPVNDTVVLSENIPLNTKIALIT 

VTDKDADHNGRVTCFTDHEIPFRLRPVFSNQFLL 

ETAAYLDYESTKEYAIKLLA\ADAGKPPLNQSAM 

LFIKVKDENDNAPVFTQSFVTVSIPENNSPGIQLT 

KVSAMDADSGPNAKINYLLGPDAPPEFSLDCRT 

GMLTVVKKLDRJEKEDKYLFTILAKDNGVPPLTS 

NVTVFVSIIDQNDNSPVFTHNEYNFYVPENLPRH 

GTVGLITVTDPDYGDNSAVTLSILDENDDFTIDSQ 

TGVIRPNISFDREKQESYTFYVKAEDGGRVSRSSS 

AKVTINVVDVNDNKPVFIVPPSNCSYELVLPSTN 

PGTVVFQV1AVDNDTGMNAEVRYSIVGGNTRDL 

FAIDQETGNITLMEKCDVTDLGLHRVLVKANDL 

GQPDSLFSVVIVNLFVNESVTNATLINELVPQKH 

LKHQ*PQILEIADVSSPTSDYVKILVAAVAGTITV 

WVIFITAVVRCRQAPHLKAAQKNMQNSEWATP 

NPENRQMIMMKKXKKKKKHSPKN^ 

TKADDVDSDGNRVTLDLPIDLEEQTMGKYNWV 

TTPTTFKPDSPDLARHYKSASPQPAFQIQPETPLN 

JUIVlXrill^JCiiji J_#L/IN J r Y AY ^ J^O 1 0 IN V^OOOOO JL/I IOV uU 

CGYPVTTFEVPVSVHTRPPVDLEVGGAQSGQVAI 
LTSSLMELLLCLMVAAFLPLELRPLGQQNVMSW 
EQEAKDLLVGYWGDGEWCHFHFHHLIPGPVNPG 
YERKQYHILDSDSEDTQPSGELCPIPVRPFTELSIQ 
LLQDDGEHCGTKQGFQPAVQLGLLPHKTLK 

3734 

A 

1 

840 

GTRPGHLPAPSDGFCV/HL*SIPSWGSF*GESL/EM 
QLITSLGLQEFD1ARNVLELIYAQTLVWIGIFFCPL 
LPFIQMIMLFIMFYSKNISLMMNFQPPSKAWRAS 

^MMUrlrLLrrror luVLL 1 JLAl 1 LWKl^ivrbADC 

GPFRGLPLFIHSIYSWIDTLSTRPGYLWVVWIYRN 

LIGSVHFFFILTLIVLIITYLYWQITEGRKIMIRLLH 

EQIINEGKDKMFLIEKL1KLQDMEKKANPSSLVLE 

RREVEQQGFLHLGEHDGSLDLRSRRSVQEGNPR 

A 

A 

3735 

A 

2 

432 

VEVCRRYLWKMTVDASQNVQCCV1FSHFPFIFN 

NLSKIKLLHTDTLLKIESKKHKAYLRSAA1EEERE 

SEFALRPTFDLTVRRNHLIEDVLNQLSQFENEDL 

RKELWVSFSGEIGYDLGGS/VKKEIFYCLFAEMIQ 

PEYGMFMY 

3736 

A 

1542 

343 

KGAPSFVRLYQYPNFAGPHAALANKSFFKADKV 

TMLWNKKATAVLVIASTDVDKTGASYYGEQTL 

HYIATNGESAVVQLPKNGPIYDVVWNSSSTEFCA 

VYGFMPAKATIFNLKCDPVFDFGTGPRNAAYYS 

PHGHILVLAGFGNLILQI*AD/IMKVWNVKNYKLI 

SKPVASDSTYFAWCPDGEHILTATCAPRLRVNN 

G YK1 WH YTGS1LHKYD VPSN AEL WQ VS WQPr LD 

GIFPAKTITYQAVPSEVPNEEPKVATAYRPPALRN 

KPITNSKLHEEEPPQNMKPQSGNDKPLSKTALKN 

QRKHEAKKAAKQEARSDKSPDLAPTPAPQSTPR 

NTVSQSISGDPEIDKKIKNLKKKLKAIEQLKEQAA 

TGKQLEKNQLEKIQKETALLQELEDLELGI 

3737 

A 

3190 

664 

VAMGTPRAQHPPPPQLLFL1LLSCPWIQGLPLKEE 

EILPEPGSETPTVASEALAELLHGALLRRGPEMG 

YLPGPPLGPEGGEEETTTTnTTTTVTTTVTSPVLC 

NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 

YPGYGDEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DILTCQWDLSWSAAPPACQK1MTCADPGEIANG 

HK 1 A cjiJAOr rvu ori Vl^i KtLru Y oJLfcu A A JVLL J C 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 
YQTLYKHHYQAGESLRFFCYEGFEL1GEVTITCV 
PGHPSQWTSQPPLCKVTQTTDPSRQLEGGNLAL 

ATT T Pf (IT VTVT G^nVYTYYTKT OfiK^T PfiF^fr^H 

SYSPITVESDFSNPLYEAGDTREYEVSI 

3738 

A 

3190 

664 

VAMGTPRAQHPPPPQLLFL1LLSCPWIQGLPLKEE 
EILPEPGSETPTVASEALAELLHGALLRRGPEMG 
YLPGPPLGPEGGEEETTTTI1TTTTVTTTVTSPVLC 
NNNISEGEGYVESPDLGSPVSRTLGLLDCTYSIHV 

YPGYGIEIQVQTLNLSQEEELLVLAGGGSPGLAP 

RLLANSSMLGEGQVLRSPTNRLLLHFQSPRVPRG 

GGFRIHYQAYLLSCGFPPRPAHGDVSVTDLHPGG 

TATFHCDSGYQLQGEETLICLNGTRPSWNGETPS 

CMASCGGTIHNATLGRIVSPEPGGAVGPNLTCR 

WVIEAAEGRRLHLHFERVSLDEDNDRLMVRSGG 

SPLSPVIYDSDMDDVPERGLISDAQSLYVELLSET 

PANPLLLSLRFEAFEEDRCFAPFLAHGNVTTTDPE 

YRPGALATFSCLPGYALEPPGPPNAIECVDPTEPH 

WNDTEPACKAMCGGELSEPAGVVLSPDWPQSY 

SPGQDCVWGVHVQEEKRILLQVEILNVREGDML 

TLFDGDGPSARVLAQLRGPQPRRRLLSSGPDLTL 

QFQAPPGPPNPGLGQGFVLHFKEVPRNDTCPELP 

PPEWGWRTASHGDLIRGTVLTYQCEPGYELLGS 

DJLTCQWDLSWSAAPPACQKIMTCADPGEIANG 

HK I AbDAUrPVGSHVQYRCLPGYSLEGAAMLTC 

YSRDTGTPKWSDRVPKCALKYEPCLNPGVPENG 

YQTLYKHHYQAGESLRFFCYEGFELIGEVT1TCV 

PuHPSQ W I SQPPLCKVTQTTDPSRQLEGGNLAL 

AILLPLGLVIVLGSGVYIYYTKLQGKSLFGFSGSH 

SYSPITVESDFSNPLYEAGDTREYEVSI 

3739 

A 

734 

445 

LLEPEPAEEYTEQSEVEST/EGMIL1*CCLYFAAFQ 
TNVSNIYFALQYVNRQFMAETQFTSGEKEQVDE 
WTVETVEVRVLCIAKLLSLSSVSNFYLY 

3740 

A 

2 

1578 

MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKF1QVGVVQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT 

KIAWLTDGKSQDDVKDAAQAARDSKITLFAIG 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVMKQKLCEESVCPTRIPVAARDERGFDILLGLD 

VNH<XVKJKRIQLSPKKIICGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKKIWDLWR1LT1DG/* 

PQIAVTLNGVDKILLFTTTSVINGSQVVTFANPQV 

K 1 LrUbu WrlQlRLL V 1 EQDV I LY1DDQQIENKPL 

HPVLGILINGQTQIGKYSGKEETVQFDVQKLRIY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP 

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL, 

PGYKGEPGRDGDK 

3741 

A 

5048 

1236 

MSAPAGSSHPAASARIPPKFGGSAVSGAAAPAGP 

GAGPAPHQQNGPAQNQMQVPSGYGLHHQNYIA 

PSGHYSQGPGKMTSLPLDTQCGDYYSALYTVPT 

QNVTPNTVNQQPGAQQLYSRGPPAPHIVGSTLGS 

FQGAASSASHLHTSASQPYSSFVNHYNSPAMYS 

ASSSVASQGFPSTCGHYAMSTVSNAAYPSVSYPS 

LPAGDTYGQMFTSQNAPTVRPVKDNSFSGQNTA 

ISHPSPLPPLPSQQHHQQQSLSGYSTLTWSSPGLP 

STQDlvn.nuraTGSLAVANlWTITVADSLSCPVM 

V^lN V v^rx JVooIr V Vol Y JjOUDoVJaoo 1 Ivl r i l/\lNrjJr V 

EPVTSVTQPSELLQQKGVQYGEYVNNQASSAPT 
PLSSTSDDEEEEEEDEEAGVDSSSTTSSASPMPNS 
YDALEGGSYPDMLSSSASSPAPDPAPEPDPASAP
APASAPAPVVPQPSKMAKPLAMAIQHFSLVIRML 
QHHLFLEYSPSNPVYSGFQQYPQQYPGVNQLSSS 




WO 01/57190 


PCT/US01/04098


IGGLSLQSSPQPESLRPVNLTQERNILPMTPVWAP 

VPNLNADLKKLNCSPDSFRCTLTN1PQTQALLNK 

AKLPLGLLLHPFRDLTQLPVITSNTIVRCRSCRTYI 

NP\FVSFIDQRR*KCNLCYRVNDVPEEFM\ r NPLT 

RSYGEPHKRPEVQNSVTVEFIASSDYMLRPPQPAV 

YLFVLDVSHNAVEAGYLTI/LWCQSLLEVNLDKLP 

G\DSR1MUGFMTFD\STYSFLQFTQEGLSQPQMLI 

VSDIDDVFLPTPDSLLVNLYESKELIKDLLNALPN 

MFTNTRETHSALGPALQAAFKLMSPTGGRVSVF 

QTQLPSLGAGLLQSREDPNQRSSTKVVQHLGPAT 

DFYKKLALDCSGQQTAVDLFLLSSQYSDLASLA 

CMSKYSAGCIYYYPSFHYTHNPSQAEKLQKDLK 

RYLTRKIGFEAVMRJRCTKGLSMHTFHGNFFVRS 

TDLLSLANINPDAGFAVQLSIEESLTDTSLVCFQT 

ALLYTSSKGERRJRVHTLCLPVVSSLSDVYAGVD 

VQAAICLLANMAVDRSVSSSLSDARDALVNAVV 

DSLSAYGSTVSNLQHSALMAPSSLKLFPLYVLAL 

LKQKAFRTGTSTRLDDRVYAMCQIKSQPLVHLM 

KM1HPNLYRIDRLTDEGAVHVNDRTVPQPPLQKL 

SAEKLTREGAFLMDCGSVFYIWVGKGCDNNFIE 

DVLGYTNFASIPQKMTHLPELDTLSSERARSFIT 

WLRDSRPLSPILHIVKDESPAKAEFFQHLIEDRTE 

AAFSYYEFLLHVQQQICK 

3742 

A 

934 

68 

SMLASQGVLLHPYGVPMIVPAAPYLPGLIQGNQE 

AAAAPDTMAQPYASAQFAPPQNG1PAEYTAPHP 

HPAPEYTGQTTVPEHTLNLYPPAQTHSEQSPADT 

SAQTVSGTRNKQD*RSTDGWPSPKTQTS*KHGK

QVSSPSGLHVSNIPFR\FRDPDLRQMF\GQFGKILD 

VEIIFNERGSKGFGFVTFENSADADRAREK\LHGT 

VVXEGRKIXEVMNATAJRVMTNKKTVNPYTNGWK 

LNPVVGAVYSPEFYAGTVLLCQANQEGSSMYSA 

PSTDFRGAKLHTSRPLLSGS 

3743 

A 

3 

1456 

QFQQAWMQNKVPIPAPNEVLNDRXEDIKLEEKK 

KTQAEIEQEMATLQYTNPQLLEQLKIERLAQKQV 

EQIQPPPSSGTPLLGPQPFPGQGPMSQIPQGF/PTA 

PSISADANEHGSVKGPPGPQGQFRPPGPQGQMGP 

QGPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQD 

MHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG

HMGPQGPPGPQGHIGPQGPPGPQGHLGPQGPPGT 

QGMQGPPGPRGMQGPPHPHGIQGGPGSQGIQGP 

VSQGPLMGLNPKGMQGPPGPRENQGPAPQGMI 

MGHPPQEMRGPHPPGGLLGHGPQEMRGPQEIRG 

MQGPPPQGSMLGPPQELRGPPGSQSQQGPPQGSL 

GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQ 

QQKTPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQ 

GAQGRIPPLNPGQGPGPNKVS/ERGAPPRHEGRA 

PPRGRDGFPGPMKTLV 

3744 

A 

1571 

652 

PLTGRKCPGWTHSGSRRSPRIAEEVPGFPKRAEA 

SRQFSETADRLELLRRAVMAAARATTPADGEEP 

APEAEALAAARERSSRFLbOLbL VKvjuAbAKVr K 

GRFQGRAAVIKHRFPKGYRHPALEARLGRRRTV 

QEARALLRCRRAGISAPVVFFVDYASNCLYMEEI 

EGSVTVRDMFSPLWRLKKTPQGLSNLAKTIGQVL 

ARMHDEDLIHGDLTTSNMLLKPPLEQLNIVLIDF 

GLSFISALPEDKGVDLYVLEKAFLSTHPNTETVFE 

AFLKSYSTSSKKj\RPVLKKLDEVRLRGKKRSMV 
G 

3745 

A 

127 

1433 

GSHRFSLASPLDPEVGPYCDTPTMRTLFNLLWLA 

LACSPVHTTLSKSDAKKAASKTLLEKSQFSDKPV 

QDRGLVVTDLKAESVVLEHRSYCSAKARDRHFA 

GDVLGYVTPWNSHGYDVTKVFGSKFTQ1SPVWL 

QLKRRGREMFEVTGLHDVDQGWMRAVRKHAK 

GL\P*CLGSCLRTGLTMISG/YVLDSEDEIEELSKT 

VVQVAKNQHFDGFVVEVWNQLLSQKRVGLIHM 

LTHLAEALHQARLLALLVIPPAITPGTDQLGMFT 

HKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPL 

SWVRACVQVLDPKSKWRSKILLGLNFYGMDYA 

TSKDAREPVVGARYIQTLKDHRPRMVWDSQVSE 

HFFEYKKSRSGRHVVFYPTLKSLQVRLELARELG 

VGVSIWELGQGLDYFYDLL*VGIAASAVDVFFSK 

PWSE 

3746 

A 

1 

898 

IDRAAECRTKPLPMAVSIRGNADSIVACLVLMVL

YLIKKRLVACAAVFYGFAVHMKJYPETYILPITL 

HLLPDRDNDKSLRQFRYTFQACL*ELLKRLCNRT 

ALMFVAVAGLTFFALSFGFYYEYGWEFLEHTYP 

YHLTRRDIRHNFSPYFYMLYLTAESKWSFSLGIA 

AFLPQLILLSAVSFAYYRDLVFCWFLHTSIFVTFN 

KVCTSQYFLWYLCLLPLVMPLVRMPWKRAWL 

LMLWFIGQAMWLAPAYVLEFQGKNTFLFIWLA 

GLFFLLINCSILIQIISHYKEEPLTERIKYD 

3747 

A 

1 

2325 

MVISFQGLVTFGDVAVDFSQEEWEWLNPIQRNL 

YRKVMLENYRNLASLGLCVSKPDVISSLEQGKEP 

WTVKRKMTRAWCPDLKAVWXIKELPLKKDFCE 

GKLSQAVITERLTSYNLEYSLLGEHWDYDALFET 

QPGLVTIKNLAVDFRQQLHPAQKNFCKNGIWEN 

NSDLGSAGHCVAKPDLVSLLEQEKEPWMVKREL 

TGSLFSGQRSVHETQELFPKQDSYAEGVTDRTSM 

TKLDCSSFRENWDSDYVFGRKLAVGQETQFRQE

PITHNKTLSKERERTYNKSGRWFYLDDSEEKVH 

NRDSIKNFQKSSVVIKQTGIYAGKKLFKCNECKK 

TFTQSSSLTVHQRIHTGEKPYKCNECGKAFSDGS 

SFARHQRCHTGKKPYECIECGKAFIQNTSLIRHW 

RYYHTGEKPFDCIDCGKAFSDHIGLNQHRRIHTG 

EKPYKCDVCHKSF\RYGSSLTVHQRIHTGEKPYE 

CDVCRKAFSHHASLT\Q\HQRVHSGEKPFKCKEC 

GKAFRQNIHLASHLRIHTGEKPFECAECGKSFSIS 

SQLATHQRIHTGEKPYECKVCSKAFTQKAHLAQ 

HQKTHTGEKPYECKECGKAFSQTTHLIQHQRVH 

TGEKPYKCMECGKAFGDNSSCTQHQRLHTGQRP 

YECIECGKAFKTKSSLICHRRSHTGEKPYECSVC 

GKAFSHRQSLSVHQRIHSGKKPYECKECRKTFIQI 

GHLNQHKRVHTGERSYNYKKSRKVFRQTAHLA 

HHQRIHTGESSTCPSLPSTSNPVDLFPKFLWNPSS 

LPSP 

3748

A 

A. 


1 

OVjY lKouYUoALiUJr VJrJrliJLri Vv^JLrOKVrL.V Ivj 

GNSGIGKATALEIAKRGGTVHLVCRDQAPAEDA 

RGEIIRE\SGNQNIFLHIVDLSDPKKIWKFVENFKQ 

EHKiHVL\\^AGCMWKREAHKXMDFEKNFG 

CQYSGVCTFLTTRPDPLCWRKNTDPRVITWSSG 

GMLVQKLNNQ*SPVRKNTIWMGTMVYAQNKVS

ERQQVVLTJERWGPRAPGMHFSSMHPGWAVDTPG 
VRQAMPGFHVQASGYRLRSEAQGADTMLWLAL 
SSARSRTAQRP 

3749 

A 

1939 

715 

GFLRLSQAT\RQRLSIPVMVLTLDPTRD\QCFGDR 

FSRLLLDEFLGYDDIUMSSVKGLAENEENKGFLR 

NVVSGEHYRFV\SMWMART\SYLAAFANHGQSF 

TLSVSHACCGYSHHQIFVFIVDLLQMLEMNMAIA 

FPAAPLLTVILALVGMEAIMSEFFNDTTTAFYIILI

VWLADQYDAICCHTSTSKRHWLRFFYLYHFAFY 

AYHYRFNGQYSSLALVTSWLFIQHSMIYFFHHYE 

T PATT fYI4VRTri\FMf T OAPTI HPHTPTAM PHnUN 

NNSGAPATAP\DSAGQPPALGPVSPGASGSPGPV 

AAAPSSLVAAAASVAAAAGGDLGWMAETAAUT 

DASFLSGLSASLLERRPASPLGPAGGLPHAPQDS 

VPPSDSAASDTTPLGAAVGGPSPASMAPTEAPSE 

VGS 

3750 

A 

2 

844 

GLLEPFSKLLSFVIQNAVFTLAYLVELCGLCYRA 
FTKERDKFYLSRSVVLELLQALKLKSPLPDTNLL 
LLVQFICADAGTKLAESTILSKQMIASVPGCGTA 

AMJc.CVK.V^Y llNJlVJLlJrMvAUMri 1 L, 1 KLRorlMJV 1 

SQPLHEDTFGGHLKVGLAQIAAMDISRGNHRDN
KAVIRYLPWLYHPPSAMQQGPKEFIECVSHIRLL 
SWLLLGSLTHNAVC/LKWPPLPGLPIPLDAGSHV
ADHLIVILIGFPEQSKTSVL\HMCSLFHAF\SLAQL 

3751 

A 

431 

2 

AFTRKCEETAFIVPQCEHPTE/WVCRRIPTGSSLER
NPGVKEGCEFCPPKVEMFFKDDANHDPQWSRQ 
QLIAAKFGFAALGI/QTEVDIMSHAT*AVFEIPEKS 
RL\PQNCTPVDMKIEFGVHVTSKEILTDVIDNDS*
RHSPS 

3752 

A 

131 

1278 

AWSGSGLLVLCINTASMPMISVLGKMFLWQREG 

PGGRWTCQTSRRVSSDPAWAVEWIELPRGLSLSS

LGSARTLRGWSRSSRPSSVDSQDLPEVNVGDTV 

AMLPKSRRALTIQEIAALARSSLHGISQWKDHV 

TKFTAMAQGRVAHLIEWKGWSKPSDSPAALESA 

FSSYSDLSEGEQEARFAAGVAEQFAIAEAKLRA 

wQWTYiFnQTnnQvnFnF a rifrMnTnM A fiOi pt 

Woo V U\Jl-iL/0 1 UUO I LtLZLJr I\\J\JlvLLJ 1 JLJAVl/WjV^l-rrJL. 

GPHLQDLFTGHRFSRPVRQGSVEPESDCSQTVSP 

DTLCSSLCSLEDGLLGSPARLAVPSCWAMSCFSPN 

CPPAGKVPSAAW/APLEAQDSLYNSPLTESCLSP 

AEEEPAPCKDCQPLCPPLTGSWERQRQASDLASS 

GVVSLDEDEAEPEEQ 

3753 

A 

3 

1138 

YYSSVRQRVTCEEPRFRECAAALiEGSATEVYAG 

EWRADRRSGFGVSQRSNGLRYEGEWLGNRRHG 

YGRTTRPDGSREEGKYKRNRLVHGGRVRSLLPL 

ALRRGKVKEKVDRAVEGARRAVSAARQRQEIA 

AARAADALLKAVAASSVAEKAVEAARMAKLIA 

QDLQPMLEAPGRRPRQDSEGSDTEPLDEDSPGV 

VPTJnT TDCUnCORT DCQDA QQP OPWP PP A PP QPT P 

PGGDOGPFSSPKAWPEEWGGAGAOAEELAGYE 

AEDEAGMQGPGPRDGSPLLGGCSDSSGSLREEE 

GEDEEPLPPLRAPAGTEPEPIAMLVLRGSSSRGPD 

AGCLTEELGEPAATERPAQPGAANPLVVGAVAL 

LDLSLAFLFSQLLT 

3754 

A 

2 

3338 

SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPPILPILLPPVMACVADSFYKIA 

AEALVVLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS 

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGVVAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE 

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF 

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 

3755 

A 

2 

3338 

SSLLEKMTSSDKDFRFMATSDLMSELQKDSIQLD 

EDSERKVVKMLLRLLEDKNGEVQNLAVKWLGV 

PLGAFHASLLHCLLPQLSSPRLAVRKRAVGALGH 

LATACSTDLFVELADHLLDRLPGPRVPTSPTAIRT 

LIQCLGSVGRQAGHRLGAHLDRLVPLVEDFCNL 

DDDELRESCLQAFEAFLRKCPKEMGPHVPNVTS 

LCLQYIKHDPNYNYDSDEDEEQMETEDSEFSEQE 

SEDEYSDDDDMSWKVRRAAAKCIAALISSRPDL 

LPDFHCTLAPVLIRRFKEREENVKADVFTAYIVL 

LRQTRPPKGWLEAMEEPTQTGSNLHMLRGQVPL 

VVKALQRQLKDRSVRARQGCFSLLTELAGVLPG 

SLAEHMPVLVSGIIFSLADRSSSSTIRMDALAFLQ 

GLLGTEPAEAFHPHLPILLPPVMACVADSFYKIA 

AEALVVLQELVRALWPLHRPRMLDPEPYVGEMS 

AVTLARLRATDLDQEVKERAISCMGHLVGHLGD 

RLGDDLEPTLLLLLDRLRNEITRLPAIKALTLVAV 

SPLQLDLQPILAEALHILASFLRKNQRALRLATLA 

ALDALAQSQGLSLPPSAVQAVLAELPALVNESD 

MHVAQLAVDFLATVTQAQPASLVEVSGPVLSEL 

LRLLRSPLLPAGVLAAAEGFLQALVGTRPPCVDY 

AKLISLLTAPVYEQAVDGGPGLHKQVFHSLARC 

VAALSAACPQ\EAESTASRLVCDARSPHSSTGVK 

VLAFLSLAEVGQVAGPGHERELKAVLLEALGSPS

EDVRAAASYALGRVGAGSLPDFLPFLLEQIEAEP 

RRQYLLLHSLKEALGAAQPDSLKPYAEDIWALL 

FQRCEGAEEGTRGVVAECIGKLVLVNPSFLLPRL 

RKQLAAGRPHTRSTVITAVKFLISDQPHPIDPLLK 

SFIAVHNKPSLVRDLLDDILPLLYQETKIRRDLIRE

VEMGPFKHTVDDGLDVRKAAFECMYSLLESCLG 

QLDICEFLNHVEDGLKDHYDIRMLTFIMVARLAT 

LCPAPVLQRVDRLIEPLRATCTAKVKAGSVKQEF

EKQDELKRSAMRAVAALLTIPEVGKSPIMADFSS 

QIRSNPELAALFESIQKDSTSAPSTDSMELS 

3756 

A 

112 

1361 

SLEEQQGRHPSFAPKCASQILGRIMTLITEQLQK 

QTLDELKCTRFSISLPLPDHADISNCGNSFQLVSE 

GASWRGLPHCSCAEFQ/DQPQLQLPSLRPEPAPQ 

TTVHRGNSPKEQPFSQVLRPEPPDPEKLPVPPAPPS 

KRHCRSLSVPVDLSRWQPVWRPAPSKLWTPIKH 

RGSGGGGGPQVPHQSPPKRVSSL/SVPPSSQCLFS 

MCPSSHTLQPSFLQPGPGP\DSSRPCAASPQSGSW 

ESDAESLSPCPPQRRFSLSPSLGPQASRFLPSARSS 

PASSPELPWRPRGLRNLPRSRSQPCDLDARKTGV 

KRRHEEDPRRLRPSLDFDKMNQKPYSGGLCLQE 

TAREGSSISPPWFMACSPPPLSASCSPTGGSSQVL 

SESEEEEEGAVRWGRQALSKRTLCQRDFGDLDL 

NLIEEN 

3757 

A 

413 

1 

PKPMLQQDFT/SLPDQGLDHIAE/NSYFDARSLCA 
AELVCKEWQQVTSE*MLWKKLIERMVHAYPLW 
KGLSEKVW/DQHLFKNRPTDGPPNSFHRSLYPKII 
QVIETIESNWQCG*HTLQRIQCHSEKSKGVYCLQ
YDDEK 

3758 

A 

2 

613 

FVSGSPWRMDGSTERLEARRPAGRLPWSSRQEM 

TRRPSLMAGRQHGWSAQQSATVANPVPGANPD 

LLPrffLGEPEDVYIVKNKPVLLVCKAVPATQIFF 

KCNGEWVRQVDHVIERSTDGSSGLPTMEVRINV 

SRQQVEKVFGLEEYWCQCVAWSSSGTTKSQKA 

YIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI

PPAE 

3759 

A

1 

561 

ADDTLHLWNLRQKRPAILHSLKFCRERVTFCHLP 

FQSKWLYVGTERGNIHIVNVESFTLSGYVIMAVN 

KAIELSSKSHPGPWfflSDNPMDEGKLLIGFESGT 

VVLWDLKSKKADYRYTYDEAIHSVAWHHEGKQ 

FICSHSDGTLTIWNVRSPAKPVQTITPHGKQLKD 

GKKPEPCKPILKVEFXTTR 

3760 

A 

1 

824 

LPACRCGCVAGCPSNHGICRCLRASERQVCVMH 

LKHLRTLLSPQDGAAKVTCMAWSQNNAKFAVC 

TVDRVVLLYDEHGERRDKFSTKPADMKYGRKS 

YMVKGMAFSPDSTKIAIGQTDNITYVYKIGEDWG 

DKKVICNKFIQTVKFRPVPGTLG*TNIYQYIYL*IQ 

PGVAFLTSECDFSYCKDGASWLFMVICCLP*SPA 

VSFPIGD*\SAVTCLQWPAEYIIVFGLAEGKVRLS 

NTKTNKSSTIYGTESYVVSLTTNCSGKGILSGHA 

DGYQR 

3761 

A 

2253 

320 

PVIQRCSQPYGFSLLISFFLKCVSETSQQPPSRKVF 
QLLPSFPTLTRSKSHESQLGNRIDDVSSMRFDLSH
GSPQMVRKDIGLSVTHRFSTKSWLSQVCHVCQK 
SMIFGVKCKHCRLKCHNKCTKEAPACRISFLPLT 
RLRRTESVPSDrNNPVDRAAEPHFGTLPKALTKK

EHPPAMNHLDSSSNPSSTTFSTPSSPAPFPTSSNPS 

SATTPPVNPSP\GQR\DSRPNFPSC/AYFIHHR\Q\QFI

FPDISAFAHAAPLPEAADGTRLDDQPKADVLEAH 

EAEAEEPEAGKSEAEDDEDEVDDLPSSRRPWRG 

PISRKASQTSVYLQEWDIPFEQVELGEPIGQGRW 

GRVHRGRWHGEVAIRLLEMDGHNQDHLKLFKK 

EVMNYRQTRHENVVLFMGACMNPPHLAIITSFC 

KGRTLHSFVRDPKTSLDINKTRQIAQEIIKGMGYL 

HAKGIVHKDLKSRNVFYDNG\KVVITDFGLF\GIS 

GWPVEGRRENQLKLSHDWLCYLAPEIVREMTPG 

KDEDQLPFSKAADVYAFGTVWYELQARDWPLK 

NQAAEASIWQIGSGEGMKRVLTSVSLGKEVSEN 

LSACWAFDLQERPSXFSLLMDMLEKLPKLNRRLS 

HPGHF*KSADINSSKWPRFERFGLGVLESSNPK 

M 

3762 

A 

2 

1578 

MAHYITFLCMVLVLLLQNSVLAEDGEVRSSCRT 

APTDLVFILDGSYSVGPENFEIVKKWLVNITKNF 

DIGPKFIQVGVVQYSDYPVLEIPLGSYDSGEHLTA 

AVESILYLGGNTKTGKAIQFALDYLFAKSSRFLT

KIAVVLTDGKSQDDVKDAAQAARDSKJTLFA1G 

VGSETEDAELRAIANKPSSTYVFYVEDYIAISKIR 

EVIVIKQKLCEESVCPTRIPVAARDERGFDILLGLD

VNKKVKKRIQLSPKKIKGYEVTSKVDLSELTSNV 

FPEGLPPSYVFVSTQRFKVKKIWDLWRJLTIDG/* 

PQIAVTLNGVDKILLFTTTSVINGSQWTFANPQV 

KTLFDEGWHQIRLLVTEQDVTLYIDDQQIENKPL

HPVLGILINGQTQfGKYSGKEETVQFDVQKLRJY 

CDPEQNNRETACEIPGFCLNGPSDVGSTPAPCICP

PGKPGLQGPKGDPGLPGNPGYPGQPGQDGKPVS 

TESLVISGISGITGYQGIAGTPGVPGSPGIQGARGL 

PGYKGEPGRDGDK 

3763 

A 

3 

1267 

CKVWRNPLNLFRGAEYNRYTWVTGREPLTYYD 

MNLSAQDHQTFFTCDSDHLRPADAIMQKAWRE 

RNPQARISAAHEALEINECATAYILLAEEEATTIA 

EAEKLFKQALKAGDGCYRRSQQLQHHGSQYEA 

QHSVLYLPLQ\TRHQCLGVHQKKASNVCQKTRE 

DQGSSENDERFNEGVPPSEYVQYP*KPRKALLEL 

QAYADVQAVLAKYDDISLPKSATICYTAALLKA 

RAVSDKFSPEAASRRGLSTAEMNAVEAIHRAVEF 

NPHVPKYLLEMKSLILPPEHILKRGDSEAIAYAFF 

HLAHWKRVEGALNLLHCTWEGTFRMIPYPLEKG 

HLFYPYPICTETADRELLPSFHEVSVYPKKELPFFI 

LFTAGLCSFTAMLALLTHQFPELMGVFAKAVSV 

CLEGGLGEWMGKAKGIKAA 

3764 

A 

25 

1032 

RSADGLCGNKDRERGNEFTRNQQAAQEVVNPK 

KKMKKKKYVNSGTVTLLSFAVESECTFLDYIKG 

GTQINFTVAIDFTASNGNPSQSTSLHYMSPYQLN 

AYALALTAVGEIIQHYDSDKMFPALGFGAKLPPD 

GRVSHEFPLNGNQENPSCCGIDGILEAYHRSLRT 

V^LYOr i Nr Ar V VTHVARNAAAVQDGSQYSVL 

LnTDGVISDKUQTKEAIVNG\SKLPMSIIIVGVGQ 

AEFNAMVELDGDDVRISSRGKLAERDIVQFVPFR

DYVDRTGNHVLSMARLARDVLAEIPDQLVSYM 

KAQGIRPRSPPAAPTHSPSQSPARTPPACPLHTHI 

3765 

A 

172 

3456 

LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLG

KNFDSAKVPSDEYCPACKEKGKLKALKTYRISFQ 

ESIFLCEDLQCIYPLGSKSLNNLISPDLEECHTPHK

PQKRKSLESSYKDSLLLANSKKTRNYIAIDGGKV 

LNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLE 

RNEILEADTVDMATTKDPATVDVSGTGRPSPQN 

EGCTSKLEMPLESKCTSFPQALCVQWKNAYALC 

WLDCILSALVHSEELKNTVTGLCSKEESIFWRLL 

TKYNQANTLLYTSQLSGVKDGDCKKLTSETFAEI 

ETCLNEVRDEIFISLQPQLRCTLGDMESPVFAFPL 

LLKLETHIEKLFLYSFSWDFECSQCGHQYQNRH 

MKSLVTFTNVIPEWHPLNAAHFGPCNNCNSKSQI 

RKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHFE 

GCLYQITSVIQYRANNHFITWILDADGSWLECDD 

LKGPCSERHKKFEVPASEIHIVIWERKISQVTDKE 

AACLPLKKTNDQHALSNEKPVSLTSCSVGDAAS 

AETASVTHPKDISVAPRTLSQDTAVTHGDHLLSG 

PKGLVDNILPLTLEETIQKTASVSQLNSEAFLXLEN 

KPVAENTGILKTNTLLSQESLMASSVSAPCNEKLI 

QDQFVDISFPSQVVNTNMQSVQLNTEDTVNTKS 

VNNTDATGLIQGVKSVEIEKDAQLKQFLTPKTEQ 

LKPERVTSQVSNLKKKETTADSQTTTSKSLQNQS 

LKENQKKPFVGSWVKGLISRGASFMPLCVSAHN 

RNTITDLQPSVKGVNNFGGFKTKGINQKASHVSK 

KARKSASKPPPISKPPAGPPSSNGTAAHPHAHAA 

SEVLEKSGSTSCGAQLNHSSYGNGISSANHEDLV 

EGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRT 

VRSENLEQVPQDGSPNDCESIEDLLNELPYPIDIA

NESACTTVPGVSLYSSQTHEEILAELLSPTPVSTE 

LSENGEGDFRYLGMGDSHIPPPVPSEFNDVSQNT 

HLRQDHNYCSPTKKNPCEVQPDSLTNNACVRTL 

NLESPMKTDIFDEFFSSSALNALANDTLDLPHFDE 

YLFENY 

3766 

A 

3 

1622 

AQQIVYRNVMLENYKNLVSLGYQLTKPDVILRL

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI 

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLGISKGIHREKPYECK 

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 

SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 

VTHQRTHTGDKLYTCNQCGKSFA^HSSRLIRHQR 

IHTGEKPYECPECGKSFRQSTHLILHQRIHVRVR

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS 

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK 

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG 

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 

3767 

A 

3 

1622 

AQQIVYRNVMLENYKNLVSLGYQLTKPDVILRL 

EKGEEPWLVEREIHQETHPDSETAFEIKSSVSSRSI

FKDKQSCDIKMEGMARNDLWYLSLEEVWKCRD 

QLDKYQENPERHLRQVAFTQKKVLTQERVSESG 

KYGGNCLLPAQLVLREYFHKRDSHTKSLKHDLV 

LNGHQDSCASNSNECGQTFCQNIHLIQFARTHTG 

DKSYKCPDNDNSLTHGSSLG1SKGIHREKPYECK

ECGKFFSWRSNLTRHQLIHTGEKPYECKECGKSF 
SRSSHLIGHQKTHTGEEPYECKECGKSFSWFSHL 
VTHQRTHTGDKLYTCNQCGKSF/VHSSRLIRHQR 

IHTGEKPYECPECGKSFRQSTHLILHQRIHVRVR

PYECNECGKSYSQRSHLVVHHRIHTGLKPFECKD 

CGKCFSRSSHLYSHQRTHTGEKPYECHDCGKSFS

QSSALIVHQRIHTGEKPYECCQCGKAFIRKNDLIK

HQRIHVGEETYKCNQCGIIFSQNSPFIVHQIAHTG

EQFLTCNQCGTALVNTSNLIGYQTNHIRENAY 

3768 

A 

185 

2258 

SIIIKMSRKISKESKKVNISSSLESEDISLETTVPTD 

DISSSEEREGKVRITRQLIERKELLHNIQLLKIELS

QKTMMIDNLKVDYLTKIEELEEKLNDALHQKQL 

LTLRLDNQLAFQQKDASKYQELMKQEMETILLR 

QKQLEETNLQLREKAGDVRRSLRDFELTEEQYIK 

LKAFPEDQLSIPEYVSVRFYELVNPLRKEICELQV 

KKNILAEELSTNKNQLKQLTETYEEDRKNYSEV 

QIRCQRLALELADTKQLIQQGDYRQENYDKVKS 

ERDALEQEVIELRRKHEILEASHMIQTKERSELSK

EVVTLEQTVTLLQKDKEYLNRQNMELSVRCAHE 

EDRLERLQAQLEESKKAREEMYEKYVASRDHY 

KTEYENKLHDELEQIRLKTNQEIDQLRNASREMY

ERENRNLREARDNAVAEKERAVMAEKDALEKH 

DQLLDRYRE\LQ\LSTESKVTEFLHQSKLKSFESE 

RVQLLQEETARNLTQCQLECEKYQKKLEVLTKE 

FYSLQASSEKjRJTELQAQNSEHQARLDIYEKLEK 

ELDEIIMQTAEIENEDEAERVLFSYGYGANVPTT 

AKRRLKQSVHLARRVLQLEKQNSLI/LKRSGTSK 

GPSNTAFTRSLTEANSLLNQTQQPYRYLIESVRQ 

RDSKIDSLTESIAQL/ERKDVSNLNKEKSALLQTN 

GIKMAL\DL\DQLLNHP 

3769 

A 

3 

2297 

DAAEFRVVADAMKVIGFKPEEIQTVYKILAAILH

LGNLKFVVDGDTPLIENGKVVSIIAELLSTKTDM 

VEKALLYRTVATGRDIIDKQHTEQEASYGRDAF 

AKAIYERLFCWIVTRINDIIEVKNYDTTIHGKNTV 

IGVLDIYGFEIFDNNSFEQFCINYCNEKLQQLFIQL 

VLKQEQEEYQREGIPWKHIDYFNNQIIVDLVEQQ 

HKGIMILDDACMNVGKVTDEMFLEALNSKLGK

HAHFSSRKLCASDKILEFDRDFRIRHYAGDVVYS 

VIGFroK^vfKDTLFQDFKRLMYNSSNPVLKhMWP 

EGKLSITEVTKRPLTAATLFKNSMIALVDNLASK 

EPYYVRCIKPNDKKSPQIFDDERCRHQVEYLGLL 

ENVRVRRAGFAFRQTYEKFLHRYKMISEFTWPN 

HDLPSDKEAVKKLIERCGFQDDVAYGKTKIFIRT 

PRTLl^LEELRAQMLIRJVLFLQKVWRGTLARMR 

YKRTKAALTIIRYYRRYKVKSYIHEVARRFHGVK 

TMRDYGKHVKWPSPPKVLRRFEEALQTIFNRWR 

ASQLIKSIPASDLPQVRAKVAAVEMLKGQRADL 

C*\ C\X> A \1/"C/~ , XT\/T A OT/"T>T^T , DrVTC/* N, 'TE , \/T)\/ AXICl fD 

OLl^KAWJbUJN YLAoJvrlJ I rl^) 1 ovj i r VrVAIsiJbLlvK 
KDKYMNVLFSCHVRKVNRFSKVEDRAIFVTDRH 
T YK MDPTK OYK VMT<T TTPT YNT THT ^V^NOKDOT 

L» X IViVJUL/r 1 IV I JV V IVJLXV 1 JUT i-> 1 IN JL» 1 VJJUO V 01NVjIXX/\^J_» 

VVFHT1<DNKDLIVCLFSKQPTHESRIGEL\VGVLV 
hOJFKSEKJlHLQVVNVTNPVQCSL^ 
TRLNQPQPDFTKNRSGFILSVPGN 1 

3770 

A 

3 

6276 

HKVAAPDVVVPTLDTVRHEALLYTWLAEHKPL 
VLCGPPGSGKTMTLFSALRALPDMEVVGLNFSS

ATTPELLLKTFDHYCEYRRTPNGVVLAPVQLGK 

WLVLFCDEINLPDMDKYGTQRVISFIRQMVEHG 

GFYRTSDQTWVKLERIQFVGACNPPTDPGRKPLS 

HRFLRHVPVVYVDYPGPASLTQIYGTFNRAMLR 

LIPSLRTYAEPLTAAMVEFYTMSQERFTQDTQPH 

YIYSPREMTRWVRGIFEALRPLETLPVEGLIRJWA 

HEALRLFQDRLVEDEERRWTDENIDTVALKHFP 

NIDREKAMSRPILYSNWLSKDYIPVDQEELRDYV 

KARLKVFYEEELDVPLVLFNEVLDHVLRJDRIFR 

QPQGHLLLIGVSGAGKTTLSRFVAWMNGLSVYQ 

IKVHRKYTGEDFDEDLRTVLRRSGCKNEKIAFIM

DESNVLDSGFLERMNTLLANGEVPGLFEGDEYA 

TLMTQCKEGAQKEGLMLDSHEELYKWFTSQVIR 

NLrWVFTMNPSSEGLKDRAATSPALFNRCVLNW 

FGDWSTEALYQVGKEFTSKMDLEKPNYIVPDYM 

PWYDKLPQPPSHREAIVNSCVFVHQTLHQANA 

RLAKRGGRTMAITPRHYLDFINHYANLFHEKRSE 

LEEQQMHLNVGLRKIKETVDQVEELRRDLRIKS 

QELEVKNAAANDKXKKMVKDQQEAEKKKVMS 

QEIQEQLHKQQEVIADKQMSVKEDLDKVEPAVI 

EAQNAVKSIKKQHLVEVRSMANPPAAVKLALES 

ICLLLGESTTDWKQIRSIIMRENFIPTIVNFSAEEIS

DAIREKMKKNYMSNPSYNYEIVNRASLACGPMV 

KWAIAQLNYADMLKRVEPLRNELQKLEDDAKD 

NQQKANEVEQMIRDLEASIARYKEEYAVLISEAQ 

AIKADLAAVEAKVNRSTALLKSLSAERERWEKT 

SETFKNQMSTIAGDCLLSAAFIAYAGYFDQQMR

QNLFTTWSHHLQQANIQFRTDIARTEYLSNADER 

LRWQASSLPADDLCTENAIMLKRFNRYPLIIDPS 

GQATEFIMNEYKDRKITRTSFLDDAFRKNLESAL 

RFGNPLLVQDVESYDPVLNPVLNREVRRTGGRV 

LITLGDQDIDLSPSFVIFLSTRDPTVEFPPDLCSRV 

TFVNFTVTRSSLQSQCLNEVLKAERPDVDEKRSD 

LLKLQGEFQLRLRQLEKSLLQALNEVKGRILDDD 

TIITTLENLKREAAEVTRKVEETDIVMQEVETVS 

QQYLPLSTACSSIYFTMESLKQIHFLYQYSLQFFL 

DIYHNVLYENPNLKGVTDHTQRLSIITKDLFQVA 

FNRVARGMLHQDHITFAMLLARIKLKGTVGEPT 

YDAEFQHFLRGNEIVLSAGSTPRIQGLTVEQAEA 

VVRLSCLPAFKDLIAKVQADEQFGIWLDSSSPEQ

TVPYLWSEETPATPIGQAIHRLLLIQAFRPDRLLA 

MAHMFVSTNLGESFMSIMEQPLDLTQIVGTEVKP 

NTPVLMCSVPGYDASGHVEDLAAEQNTQITSIAI 

GSAEGFNQADKAINTAVKSGRWVMLKNVHLAP 

GWLMQLEKKLHSLQPHACFRLFLTMEINPKVPV 

NLLRAGRIFVFEPPPGVKANMLRTFSSIPVSRICK 

SPNERARLYFLLAWFHAIIQERLRYAPLGWSKKY 

EFGESDLRSACDTVDTWLDDTAKGRQNISPDKIP 

WSALKTLMAQSIYGGRVDNEFDQRLLNTFLERL 

FTTRSFDSEFKLACKVDGHKDIQMPDGIRREEFV 

QWVELLPDTQTPSAVLGLPNNAERVLLTTQGVD

MISKMLKMQMLEDEDDLAYAETEKKTRTDSTS

DGRPXAWMRTLHTTASNWLHLIPQTLSHLKRTVE

NIKDPLFRFFE\REVKMGAKLLQ\DVRQDLADV\V

qvcegkiok:qtnylrtli\nelv\kgilp^swshy

T VPAGxMT VIQ w O VP1 S ARRlvKQLQNISLAAAASG 

GAKELKNIHVCLGGLFVPEAYITATRQYVAQAN 

SWSLEELCLEVNVTTSQGATLDACSFGVTGLKL 

QGATCNNNKLSLSNAISTALPLTQLRWVKQTNT 

EKKASVVTLPVYLNFTRADLIFTVDFEIATKEDPR 

SFYERGVAVLCTE 

3771 

A 

1 

2043 

LPLLHAGFNRRFMENSSnACYNELIQIEHGEVRS 

QFKLRACNSVFTALDHCHEAIEITSDDHVIQYVN 

PAFERMMGYHKGELLGKELADLPKSDKNRADL 

LDTINTCIKKGKEWQGVYYARRKSGDSIQQHVKI 

TPVIGQGGKIRHFVSLKKLCCTTDNNKQIHKIHR 

DSGDNSQTEPHSFRYKNRRKESIDVKSISSRGSDA

PSLQNRRYPSMARIHSMTIEAPITKVINIINAAQEN 

SPVTVAEALDRVLEILRTTELYSPQLGTKDEDPH 

TSDLVGGLMTDGLRRLSGNEYVFTKNVHQSHSH 

LAMPITINDVPPCISQLLDNEESWDFNIFELEAITH

KRPLVYLGLKVFSRFGVCEFLNCSETTLRAWFQ 

VIEANYHSSNAYHNSTHAADVLHATAFFLGKER 

VKGSLDQLDEVAALIAATVHDVDHPGRTNSFUC 

NAGSELAVLYNDT\AV\LESHHTALAFQ\LTVKDT 

K\CNIFKNID/RGNHYRTLRQAIIDMVLATEMTKH

FEHVNKFVNSINKPMAAEIEGSDCECNPAGKNFP

ENQILIKRMMIKCADVANPCRPLDLCIEWAGRIS

EEYFAQTDEEKRQGLPVVMPVFDRNTCSIPKSQI 

SFIDYFITDMFDAWDAFAHLPALMQHLADNYKH

WKTLDDLKCKSLRLPSDRLKPSHRGGLLTDKGH 

CESQ 

3772 

A 

1013 

50 

TLVHADGFPSLHITETCLAYREKRIGIDLVHDTVE 
HELIKEAEIIQGIMALLTRTLEEASEQIRMNRSAK 
YNLEKDLKDKFVALTIDDICFSLNNNSPNIRYSEN 
AVRIEPNSVSLEDWLDFSSTNVEKADKQRNNSL 

iv 41 T/ » t i;n\nn cat a xtvt d v Arrv\ n ;ut a rvxir t 

MLKALVDXRILSQTANYLRKQCDV VH 1 ArKNGL 

KDTKDARDQLADHLAKWMEEIASQEKNITALEK 

AILDQEGPAKVAHTRLETRTHRPNVELCRDVAQ

YRLMKEVQEITHNVARLKJETLA\QAQAELKGLH 

RRQLALQEEIQVKENTIYIDEVLCMQMRKSIPLR 

DGEDHGVWAGGLRPDAVC

3773 

A 

1 

955 

AAARESERQLRLRLCVLNEILGTERDYVGTLRFL 
QSAFLHRIRQNVADSVEKGLTEENVKVLFSNIEDI 
LEVHKDFLAALEYCLHPEPQSQHELGNVFLKFK 
DKFCVYEEYCSNHEKALRLLVELNKIPTVRAFLL 

LAKRTPGBMPDHPAVQ\SALQAMKTVCSNINETK 
RQMEKLEALEAAA/QSHIEGWEGSNLTDICTQLL 
LQGTLLKISAGNIQERAFFLFDNLLVYCKRKSRV 
TGSKKSTKRTKSINGSLYIFRGRINTEVMEVENVE
DGTGSPSPSLA 

3774 

A 

4254 

2061 

ELQGDFSVPDWKSMAWCENSICVGFKRDYYLI 
RVDGKGSIKELFPTGKQLEPLVAPLADGKVAVG 

PYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFITSGG

SNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFE 

LALQLAEMKDDSDSEKQQQIHHIKNLYAFNLFC 

QKRFDESMQVFAKJLGTDPTHVMGLYPDLLPTDY 

RKQLQYPNPLPVLSGAELEKAHLALIDYLTQKRS

QLVKiaNDSDHQSSTSPLMEGTPTlKSKKKLLQII 

DTTLLKCYLHTNVALVAPLLRLENNHCHIEESEH 

VLKJKAHKYSELIILYEKKGLHEKALQVLVDQSK 

KANSPLKGHERTVQYLQHLGTENLHLIFSYSVW 

VLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN 

FKGLAIPYLEHIIHVWEETGSRFHNCLIQLYCEKV 

QGLMKEYLLSFPAGKTPVPAGEEEGELGEYRQK 

LLMFLEISSYYDPGRLICDFPFDGLLEERALLLGR 

MGKHEQALFIYVHILKDTRMAEEYCHKHYDRN 

KDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPK 

ANLQAALQVLELHHSKLDTTKALNLLPANTQIN 

DIRIFLEKVLEENAQKKRFNQVLKNLLHAEFLRV\ 

QEERILHQQVKCIITEEKVCMVCKKKIGNSAFAR 

YPNGVVVHYFCS\KEVNPADT 

3775 

A 

1832 

839 

MSRARGALCRACLALAAALAALLLLPLPLPRAP 

APARTPAPAPRAPPSRPAAPSLRPDDVFIAVKTTR 

KNHGPRLRLLLRTWMSRARQQTFIFTDGDDPELE 

LQGGDRVINTNCSAVRTRQALCCKMSVEYDKFI 

ESGRKWFCHVDDDNYVNARSLLHLLSSFSPSQD 

VYLGRPSLDHPIEATERVQGGRTVTTVKFWFAT 

GGAGFCLSRGLALKMSPWASLGSFMSTAEQVRL 

PDDCTVGYIVEGLLGARLLHSPLFHSHLENLQRL

PPDTLLQQVTLSHGGPENPQNVVNVAGGFSLHQ 

DPTRFKSIHCLLYPDTDWCPRQKQGAPTSR 

3776 

A 

3 

796 

PRAKLGTRARNMAGQDAGCGRGGDDYSEDEGD 

SSVSRAAVEVFGKLKDLNCPFLEGLYITEPKTIQE 

LLCSPSEYRLEILEWMCTRVWPSLQDRFSSLKGV 

PTEVKIQEMTKLGHELMLCAPDDQELLKGCACA 

QKQLHFMDQLLDTIRSLTIGCSSCSSLMEHFEDT 

REKNEALLGELFSSPHLQMLLNPECDPWPLDMQ 

PLLNKQSDDWQWASASAKSEEEEKLAELARQLQ 

ESAAKLHALRTEYFAQHEQGAAAGAAYTSAP 

3777 

A 

3 

413 

SEEDVIEGKTAVIEKRRKKRSSAGVVED/IGGEVQ

NMLEGVGVDINKALLAKRKRLEMYTKASLRTSN 

QKIEHVWKTQQDQRQKJLNQEYSQQFLTLFQQW 

DLDMQKAEEQEEKILVGJMIRFIINQVSSRNGQPS 

LLL 

3778 

A 

132 

788 

SRLPPPPPHLADGRAGARVPRSARLSRWWVQD 

WTHGPIVRPPAAARTMWVNPEEVLLANALWITE

RANPYFILQRRKGHAGDGGGGGGLAGLLVGTLD 

VVLDSSARVAPYRILYQTPDSLVYWTIACGVGSR 

KEITEHWEWLEQNLLQTLSIFENENDITTFVRGKI 

QGIIAEYNKINDVKEDDDTEKFKEAIVKFHRLFG 

MPEEEKLVNYYSCSYWKG 

3779 

A 

2 

934 

CKSCTLFPQNPNLPPPSTRERPPGCKTVFVGGLPE 

NATEEIIQEVFEQCGDITAIRKSKKNFCHIRFAEEF 

MVDKAIYLSGYRMRLGSSTDKKDSGRLHVDFA 

QARDDFYEWECKQRMRAREERHRRKLEEDRLR 

PPSPPAIMHYSEHEAALLAEKLKDDSKFSEAM\Q 

V LLo W IhKufc V NKK.\L>AJN^r Y b M V l^b A W on V KKL 

MNEKATHEQEMEEAKENFKNALTGILTQFEQIV 
AVFNASTRQKAWDHFSKAQRKNIDIWAKVHSEE 
LRKAQSEQLMGIRREEEMEMSDDENCDSPTKKM 
RVDESALGAP 

3780 

A 

1 

2535 

AAQAEREELAAGRMPGGGPQGAPAAAGGGGVS

HRAGSRDCLPPAACFRRRRLARRPGYMRSSTGP 

GIGFLSPAVGTLFRFPGGVSGEESHHSESRARQC 

GLDSRGLLVRSPVSKSAAAPTVTSVRGTSAHFGI 

QLRGGTRLPDRLSWPCGPGSAGWQQEFAAMDS 

SETLDASWEAACSDGARRVRAAGSLPSAELSSNS 

CSPGCGPEVPPTPPGSHSAFTSSFSFIRLSLGSAGE 

RGEAEGCPPSREAESHCQSPQEMGAKAASLDGP 

HEDPRCLSQPFSLLATRVSADLAQAARNSSRPER 

DMHSLPDMDPGSSSSLDPSLAGCGGDGSSGSGD 

AHSWDTLLRKWEPVLRDCLLRNRRQMEVISLRL 

KLQKLQEDAVENDDYDKAETLQQRLEDLEQEKI 

SLHFQLPSRQPALSSFLGHLAAQVQAALRRGATQ 

QASGDDTHTPLRMEPRJLLEPTAQDSLHVSITRRD 

WLLQEKQQLQKEIEALQARMFVLEAKDQQLRRE 

IEEQEQQLQWQGCDLTPLVGQLSLGQLQEVSKA 

LQDTLASAGQIPFHAEPPETIRSLQERIKSLNLSLK 

EITTKVCMSEKPCSTLRKKVNDIETQLPALLEAK 

MHAISGNHFWTAKDLTEEIRSLTSDREGLEGLLS 

KLLVLSSRNVKKLGSVKEDYNRLRREVEHQETA 

YETSVKENTMKYMETLKNKLCSCKCPLLGKVW 

EADLEACRLLIQCLQLQEARGSLSVEDERQMDD 

LEGAAPPIPPRLHSEDKRKTPLKESYILSAELGEK

CEDIGKKLLYLEDQLHTAIHSHDEDLIQSLRRELQ 

MVKETLQAMILQLQPAKEAGEREAAASCMTAG 

VHEAQA 

3781 

A 

3 

995 

GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

GAGVNQRMDSYAHMNGWSNGSYSMMQDQLG 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGSVMG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

NGTLPLSHM 

3782 

A 

1 

2649 

FRVPDSCPVVLHSFTQLDPDLPRPESSTQEIGEELI 

NGVIYSISLRKVQLHHGGNKGQRWLGYENESAL 

NLYETCKVRTVKAGTLEKLVEHLVPAFQGSDLS 

YVTIFLCTYRAFTTTQQVLDLLFKRYGRCDALTA 

SSRYGCILPYSDEDGGPQDQLKNAISSILGTWLD 

QYSEDFCQPPDFPCLKQLVAYVQLNMPGSDLER 

RAHLLLAQLEHSEPIEAEPEGEEDWALSPVPALK

PTPELELALTPARAPSPVPAPAPEPEPAPTPAPGSE 

LEVAPAPAPELQQAPEPAVGLESAPAPALELEPA 

PEQDPAPSQTLELEPAPAPVPSLQPSWPSPVVAEN 

GLSEEKPHLLVFPPDLVAEQFTLMDAELFKKWP 

YHCLGSIWSQRDKKGKEHLAPTIRATVTQFNSV 

ANCVITTCLGNRSTKAPDRARVVEHWIEVAREC 

RTT TOJF^T YATI *5AT nSTNI^n-TRT Ic~k r TWFTWQPr>Q 

JTVLL»lVJNi OOJL> I /\lLO/\J^V^OlN OllTLtvi^lVrV 1 WCl/VOIVL/iJ 

FRIFQKLSEIFSDENNYSLSRELLIKEGTSKFATLE 

MNPKRAQKRPKETGIIQGTVPYLGTFLTDLVML

DTAMKDYLYGRLINFEKRRKEFEVIAQIKLLQSA 

CNNYSIAPDEQFGAWFRAVERLSETESYNLSCEL 

EPPSESASNTLRTKKNTAIVKRWSDRQAPSTELS

TSGSSHSKSCDQLRCGPYLSSGDIADALSVHSAG 

SSSSDVEEINISFVPESPDGQEKXFWESASQSSPET 

SGISSASSSTSSSSASTTPVAATRTHKRSVSGLCNS 

SSALPLYNQQVGDCCIIRVSLDVDNGNMYKSILV

TSQDKAPAVIRKAMDKHNLEEEEPEDYELLQILS 

DDRKLKIPENANVFYAMNSTANYDFVLKKRTFT 

KGVKVKHGASSTLPRMKQKGLKIAKGIF 

3783 

A 

3 

869 

RSGQGKVYGLIGRRRFQQMDVLEGLNLLITISGK 

RNKLRVYYLSWLRNKILrlNDPEVEKKQGWTTV 

GDMEGCGHYRWKYERIKFLVIALKSSVEVYAW 

APKPYHKFMAFKSFADLPHRPLLVDLTVEEGQR 

LKVIYGSSAGFHAVDVDSGNSYDIYIPVHIQSQIT

PHAIIFLPNTDGMEMLLCYEDEGVYVNTYGRIIK

DVVLQWGEMPTSVAYICSNQIMGWGEKAIEIRS

VETGHLDGVFMHKRAQRLKFLCERNDKVFFASV 

RSGGSSQVYFMTLNRMCIMNW 

3784 

A 

1213 

457 

LSPRQVDGLAGLQKGLSLSLLYQFLMNGIRLGTY 

GLAEAGGYLHTAEGTHSPARSAAAGAMAGVMG

AYLGSPIYMVKTHLQAQAASEIAVGHQYKHQG 

MFQALTEIGQKHGLVGLWRGALGGLPRVIVGSS 

TQLCTFSSTKDLLSQWEIFPPQSWKLALVAAMM 

SGIAVVLAMAPFDVACTRLYNQPHRCTGQGP\LY 

RGILDALLQTARTEGIFGMYKGIGASYFRLGPHTI

LSLFFWDQLRSLYYTDTK 

3785 

A 

193 

813 

RRRGRHSLCGGKMLAYCVQDATVVDVEKRRNP 

SKHYVYIINVTWSDSTSQTIYRRY\SKFFDLQMQL 

LD\KFPI\ESGQKDPKQRIIPFLPGKILFRRSHIRDV

AVKRLKPIDEYCRALVRLPPHISQCDEVFRFFEAR

PEDVNPPKEQGPSPPDAVLPYGVNKGKQELKAG 

PNWPGRTHHVVNCVTQKCLFVFHFKFSSSGNKE 

SKSL 

3786 

A 

3785 

1632 

EFVGRAASTTVVTRIAWRMADAGIRRVVPSDLY 

PLVLGFLRDNQLSEVANKFAKATGATQQDANAS 

SLLDIYSFWLNRSAKVPERKLQANGPVAKKAKK 

KASSSDSEDSSEEEEEVQGPPAKKAAVPAKRVGL 

PPGKAAAKASESSSSEESSDDDDEEDQKKQPVQ 

KGVKPQAKAGQAPPKKAKSSDSDSDSSSEDEPP 

KNQKPKITP\VTVKAQTKAPPKPARA\APKIANGK

AASSSSSSSSSSSSDDSEEEKAAATPKKTVPKKQV 

VAJCAPVKAATTPTRKSSSSEDSSSDEEEEQKKPM 

KNKPGPYSSVPPPSAPPPKKSLGTQPPKKAVEKQ 

QPVESSEDSSDESDSSSEEEKKPPTKAVVSKATTK 

PPPAKKAAESSSDSSDSDSSEDDEAPSKPAGTTK 

NSSNKPAVTTKSPAVKPAAAPKQPVGGGQKLLT 

RKADSSSSEEESSSSEEEKTKKMVATTKPKATAK 

AALSLPAKQAPQGSRDSSSDSDSSSSEEEEEKTSK 

SAVKKKPQKVAGGAAPSKPASAKKGKAESSNSS 

SSDDSSEEEEEKLKGKGSPRPQAPKANGTSALTA 

QNGKAAKNSEEEEEEKKKAAVVVSKSGSLKKR 

KAj N b AAivb AJb I rQAKKiivLQ 1 r N ITPKKJvKGJbK 

RASSPFRRVREEEIEVDSRVADNSFDAKRGAAGD 

WGERANQVLKFTKGKSFRHEKTKKKRGSYRGG 

SISVQVNSIKFDSE 

3787 

A 

3 

5078 

IPEG/RALSAEHTSSLVPSLHITTLGQEQADLSGAV 
PASPSTGTADFPSE.TFLQPTENHASPSPVPEMPTL

PAEGSDGSPPATRDLLLSSKVPNLLSTSWTFPRW 

KJCDSVTAILGKNEEANVTIPLQAFPRKEVLSLHT 

VNGFVSDFSTGSVSSPIITAPRTNPLPSGPPLPSILS 

IQATQTVFPSLLAFSSTKPEVYAAAVDHSGLPAS 

APKQVRASPSSMDVYDSLTIGDMKKPATTDVFW 

SSLSAETGSLSTESIISGLQQQTNYDLNGHTISTTS 

WETHLAPTAPPNGLTSAADAIKSQDFKDTAGHS 

VTAEGFSIQDLVLGTSIEQPVQQSDMTMVGSHID 

LWPTSNNNHSRDFQTAEVAYYSPTTRHSVSHPQ 

LQLPNQPAHPLLLTSPGPTSTGSLQEMLSDGTDT 

GSEISSDINSSPERNASTPFQNILGYHSAAESSISTS 

VFPRTSSRVLRASQHPKKWTADTVSSKVQPTAA 

AAVTLFLRKSSPPALSAALVAKGTSSSPLAVASG 

PAKSSSMTTLAKKVTNKAASGPKRTPGAVHTAF 

PFTPTYMYARTGHTTSTHTA/IARKHGHCLWPVV 

YNLP/PP/GKPQAMHTGLPNPTNLEMPRASTPRPL 

TVTAALTSITASVKATRLPPLRAENTDAVLPAAS 

AAVVTTGKMASNLECQMSSKLLVKTVLFLTQRR 

VQISESLKFSIAKGLTQALRKAFHQNDVSAHVDI 

LEYSHNVTVGYYATKGKLVYLPAVVIEMLGVY 

GVSNVTADLKQHTPHLQSVAVLASPWNPQPAG 

YFQLKTVLQFVSQADNIQSCKFAQTMEQRLQKA 

FQDAERKVLNTKSNLTIQIVSTSNASQAVTLVYV 

VGNQSTFLNGTVASSLLSQLSAELVGFYLTYPPL 

TIAEPLEYPNLDISETTRDYWVITVLQGVDNSLV 

GLHNQSFARVMEQRLAQLFMMSQQQGRRFKRA 

TTLGSYTVQMVKMQRVPGPKDPAELTYYTLYN 

GKPLLGTAAAKILSTIDSQRMALTLHHVVLLQAD 

PWKNPPNNLWIIAAVLAPIAVVTVIIIIITAVLCR 

KNKlsIDFKPDTMINLPQRAKPVQGFDYAKQHLG 

QQGADEEVIPVTQETVVLPLPIRDAPQERDVAQD 

GSTIKTAKSTETRKSRSPSENGSVISNESGKPSSGR 

RSPQNVMAQQKVTKEEARKRNVPASDEEEGAV 

LFDNSSKVAAEPFDTSSGSVQLIAIKPTALPMVPP 

TSDRSQESSAVLNGEVNKALKQKSDIEHYRNKL 

RLKAKRKGYYDFPAVETSKGLTERKKMYEKAP 

KEMEHVLDPDSELCAPFTESKNRQQMKNSVYRS 

RQSLNSPSPGETEMDLLVTRERPRRGIRNSGYDT 

EPEIIEETN1DRVPEPRGYSRSRQVKGHSETSTLSS 

QPSIDEVRQQMHMLLEEAFSLASAGHAGQSRHQ 

EAYGSAQHLPYSEVVTSAPGTMTRPRAGVQWVP 

TYRPEMYQYSLPRPAYRFSQLPEMVMGSPPPPVP 

PRTGPVAVASLRRSTSDIGSKTRMAESTGPEPAQ 

LHDaASr IQMSRGPVSV IQLDQSALNYbGNl Vr 

AVFAIPAANRPGFTGYFIPTPPSSYRNQAWMSYA 

GENELPSQWADSVPLPGYIEAYPRSRYPQSSPSRL 

PRQYSQPANLHPSLEQAPAPSTAASQQSLAENDP 

SDAPLTNISTAALVKAIREEVAKLAKKQTDMFEF 

QV 

3788 

A 

7 

1717 
I / j 1 

iVLrvvJJLr I 1 Ly/\C,lvlJ\^oLJrs V JSX^J\JL//lJSJoJrl^^JS^ll^ V V 

VMVSGEPLLAKPARIVAGHEPERTNELLQIIGKC 

CLNKLSSDDAVRRVLAGEKGEVKGRASLTSRSQ 

ELDNKNVREEESRVHKNTEDRGDAEIKERSTSRD 

RKQKEELKEDRjMPREKDKDKEKAKENGGNRHR 

EGERERAKARARPDNERQKDRGNRERDRDSERK 
NO: 
Predicted 
nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





KETERKSEGGKEKERLRDRDRERDRDKGKDRDR 

RRVKNGEHSWDLDRENNREHDKPEKKSASSGE 

MSKXLSDGTFKDSKAETETEISTRASKSLTTKTS 

KRRSKNSVEGDSTSDAEGDAGPAGQDKSEVPET 

PEIPNELSSNIRRIPRPGSARPAPPRVKRQDSMEAL 

QMDRSGSGKTVSNVITESHNSDNEEDDQFVVEA 

K1CDYEKLQQSPKPGEKERSLFESAWKKEKDIVS 

KEIEKLRTSIQTLCKSALPLGKIMDYIQEDVDAM 

QNELQM\YHSENRQHAEALQQEQRITDCAVEP\L 

KAELA\ELEQLIKD\Q\QDKICAVKANILKNEEKIQ 

KMVYSINLTSRR 

3789 

A 

1 

4369 

MRTLGTCLATLAGLLLTAAGETFSGGCLFDEPYS 

TCGYSQSEGDDTOWEQVNTLTKPTSDPWNIPSGS 

FMLVNASGRPEGQRAHLLLPQLKENDTHCIDFH 

YFVSSKSNSPPGLLNVYVKVNNGPLGNPIWNISG 

DPTRTWNRAELAISTFWPNFYQVIFEVITSGHQG 

YLAIDEVKVLGHPCTRTPHFLRIQNVEVNAGQFA 

TFQCSAIGRTVAGDRLWLQGIDVRDAPLKEIKVT 

SSRRFIASTOVVNTTKRDAGKYRCMI\RTEGGVGI 

SNYAELWVKEPPVPIAPPQLASVGATYLW1QLN 

ANSINGDGPIVAREVEYCTASGSWNDRQPVDSTS 

YKIGHLDPDTEYEISVLLTRPGEGGTGSPGPALRT 

RTKCADPMRGPRKLEVVEVKSRQITIRWEPFGY 

NVraCHSYNLTVHYCYQVGGQEQVREEVSWDT 

ENSHPQHTITNLSPYTNVSVKLILMNPEGRKESQ 

ELIVQTDEDLPGAVPTESIQGSTFEEKIFLQWREP 

TQTYGVITLYEITYKAVSSFDPEIDLSNQSGRVSK 

LGNETHFLFFGLYPGTTYSFTIRASTAKGFGPPAT 

NQFTTKISAPSMPAYELETPLNQTDNTVTVMLKP 

AHSRGAPVSVYQIWEEERPRRTKKTTEILKCYP 

VPIHFQNASLLNSQYYFAAEFPADSLQAAQPFTIG 

DNKTYNGYWNTPLLPYKSYRIYFQAASRANGET 

KIDCVQVATKGAATPKPVPEPEKQTDHTVKIAG 

VIAGILLFVIIFLGVVLVMKKRKLUKKRKETMSS 

TRQEIDLWIGELNGPRSYAEQGTKLATRAFSFMD 

THNLNGRSVSSPSSFTMKTNTLSTSVPNSYYPDE 

THTMASDTSSLVQSHTYKKREPADVPYQTGQLH 

PAIRVADLLQHITQMKCAEGYGFKEEYESFFEGQ 

SAPWDSAKKDENRMKNRYGNIIAYDHSRVRLQT 

IEGDTNSDYINGNYIDGYHRPNHYIATQGPMQET 

IYDFWRMVWHENTASIIMVTNLVEVGRVKCCK 

YWPDDTE1YKDIKVTLBETELLAEYVIRTFAVEKR 

GVHEIREIRQFHFTGWPDHGVPYHATGLLGFVR 

QVKSKSPPSAGPLVVHCSAGAGRTGCF1VIDIML 

DMAEREGVVDIYNCVRELRSRRVNMVQTEEQY 

VFIHDAILEACLCGDTSVPASQVRSLYYDMNKLD 

PQTNSSQIKEEFRTLNMVTPTLRVEDCSIALLPRN 

HEKNRCMDILPPDRCLPFLITIDGESSNYINAALM 

n^VT^OP^AFiVTOJ-TPT PMTW HFWR T VT DYWfTS 
LsO I ivyr JAf 1 V 1 yilrLinN 1 V ryJL/r W IvL» V LiU X I O 

VVMLNDVDPAQLCPQYWPENGVHRHGPIQVEF 

VSADLEEDnSRIFRJYNAARPQDGYRMVQQFQFL 

GWPMYRDTPVSKRSFLKLIRQVDKWQEEYNGG 

EGRTVVHCLNGGGRSGTFCAISIVCEMLRHQRTV 

DVFHAVKTLRNNKPNMVDLLDQYKFCYEVALE 
Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





YLNSG 

3790 

A 

261 

485 

EEQTPLHIASRLGKTEIVQLLLQHMAHPDAATTN 
GYTPLHISAREGQV\DV\ASVLLGRQGAAHSFRLT 
KVRRMTS 

3791 

A 

1 

5874 

LPPVTMSGKYIMEEHDSYSDQVWSIDELPSKQG 

YYLQGNYLRCVAEVGSFEHNLTTDLLNHLVFVQ 

KVFMKEVNEVIQKVSGGEQPIPLWNEHDGTADG 

DKPKILLYSLNLQFKGIQVTATTPSMRAVRFETG 

LIELELSNRLQTKASPGSSSYLKLFGKCQVDLNL 

ALGQIVKHQVYEEAGSDFHQVAYFKTRIGLRNA 

LREEISGSSDREAVL1TLNRPIVYAQPVAFDRAVL 

FWLNYK\AAYDNWNEQRMALHKDIHMATKEVV 

DMLPGIQQTSAQAFGTPFLQLTVNDLGICLPITNT 

AQSNHTGDLDTGSALVLTIESTLITACSSESLVSK 

GHFKNFCIRFADGFETSWDDWKPEIHGDLVMNA 

CVVPDGTYEVCSRTTGQAAAESSSAGTWTLNVL 

WKMCGIDVHMDPNIGKRLNALGNTLTTLTGEED 

IDDIADLNSVNIADLSDEDEVDTMSPTIHTEATDY 

RRQAASASQPGELRGRKIMKR1VDIRELNEQAKV 

1DDLKXLGASEGTTNQEIQRYQQLESVAVNDIRR 

DVRKKLRRSSMRAASLKDKWGLSYKPSYSRSKS 

ISASGRPPLKRMERASSRVGETEELPEIRVDAASP 

GPRVTFNIQDTFPEETELDLLSVTIEGPSHYSSNSE 

GSCSVFSSPKTPGGFSPGIPFQTEEGRRDDSLSSTS 

EDSEKDEKDEDHERERFYIYRKPSHTSRKKATGF 

AAVHQLFTERWPTTPVNRSLSGTATERNIDFELD 

IRVEIDSGKCVLHPTTLLQEHDDISLRRSYDRSSR 

SLDQDSPSKKKKFQTWASTTHLMTGKKVPSSL 

QTKPSDLETTVFYIPGVDVKLHYNSKTLKTESPN 

ASRGSSLPRTLSKESKLYGMKDSATSPPSPPLPST 

VQSKTNTLLPPQPPPIPAAKGKGSGGVKTAKLYA 

WVALQSLPEEMVISPCLLDFLEKALETIPITPVER 

NYTAVSSQDEDMGHFEIPDPMEES\TTSLVS\SSTS 

AYSSFPVDVVVYVRVQPSQIKFSCLPVSRVECML 

KLPSLDLVFSSNRGELETLGTTYPAETLSPGGNA 

TQSGTKTSASKTGIPGSSGLGSPLGRSRHSSSQSD 

LTSSSSSSSGLSFTACMSDFSLYVFHPYGAGKQIT 

AVSGLTPGSGGLGNVDEEPTSVTGRKDSLSINLE 

FVKVSLSRIRRSGGASFFESQSVSKSASKMDTTLI 

NISAVCDIGSASFKYDMRRLSEILAFPRAWYRRSI 

ARRLFLGDQTINLPTSGPGTPDSIEGVSQHLSPESS 

RKAYCKTWEQPSQSASFTHMPQSPNVFNEHMTN 

STMSPGTVGQSLKSPASIRSRSVSDSSVPRRDSLS 

KTSTPFNKSNKAASQQGTPWETLWFAENLKQL 

NVQMNMSNVMGNTTWTTSGLKSQGRLSVGSNR 

DREISMSVGLGRSQLDSKGGWGGTIDVNALEM 

VAHISEHPNQQPSHKIQITMGSTEARVDYMGSSIL 

MGIFSNADLKLQDEWKVNLYNTLDSSITDKSEIF 

VHGDLKWDIFQVMJSRSTTPDLIKIGMKLQEFFT 

firYETYTQVT? A 1 CT\\/r3P\/PVT DPT^TN/fTQTvTT CT^QCf^P 
V^l^rlJ 1 olvlvALo 1 Wurvri Lrriv 1M1 oINLilJvooV^ri 

QLLDAAHHRHWPGVLKWSGCHISLFQIPLPEDG 

MQFGGSMSLHGNHMTLACFHGPNFRSKSWALF 

HLEEPNIAFWTEAQKIWEDGSSDHSTYIVQTLDF 

HLGHNTMVTKPCGALESPMATITKITRRRHENPP 

HGVASVKEWFNYVTATRNEELNLLRNVDANNT 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





ENSTTVKNSSLLSGFRGGSSYNHETETIFALPRM 

QLDFKSIHVQEPQEPSLQDASLKJPKVECSVVTEF 

TDH1CVTMDAELIMFLHDLVSAYLKEKEKAIFPP 

RILSTRPGQKSPII1HDDNSSDKDREDSITYTTVDW 

RDFMCNTWHLEPTLRLISWTGRKIDPVGVDYILQ 

KLGFHHARTTIPKWLQRGVMDPLDKVLSV^ 

LGTALQDEKEKKGKDKEEH 

3792 

A 

1 

364 

QNGSTPLHHAASKNRHEIALMLLEGGANPDGKD 
HYEATAKHQATAKGNFKMIHILLYYKASTIIQDT 
bGNTPPHLVCDXRVEEAKJ^LVSQuA/SlYIENKEE 
KDP/LQVAKGALGLVLKRMVEG 

3793 

A 

2 

340 

DIVPNPKMAPLGDEAPTLEKVLTPELSEEEVSTR 

r\r\T ATT TT TFO OrT" A T /\T/1 7I/\7T , A / A T/PrvTlPfl/M '1 > A T TT* 

DDIQFHHFSSEEALQKVKYFVAKEDPSSQEEAHT 

PEAPPPQPPSSERCLGEMKCTLVRGDSSPRQAEL 

KSGPASRPAL 

3794 

A 

421 

158 

SYAWGEDYTYKFFEVILIDPFHKAERRNPDTQWI 
SKAVYK11REMCGLTSTGRKSHGLEKDRMFPHAI 
GGSCRAA*RRRKTLQFPCYH 

3795 

A 

24 

592 

GGMDSRVSGTTSNGETKPVYPVMEKKEEDGTLE 

RGHWKKKMEFVLSVAGEIIGLGNVWRFPYLCYK 

NGGGAFFIPYLVFLFTCGIPVFLLETALGQYTSQG 

GVTAWRKICPIFEGIGYASQMIVILLNVYYIIVLA 

WALFYLFSSFTIDLPWGGCYHEWNTEHCMEFQK 

TNGSLNGTSENATSPVIEFW 

3796 

A 

3 

592 

KPASTYSTSQPSMAPLLPIRTLPL1L1LLALLSPGA 

ADFNISSLSGLLSPALTESLLVALPPCHLTGGNAT 

LMVRRANDSKVVTSSFVVPPCRGRRELVSVVDS 

GAGFTVTRLSAYQVTNLVPGTKFYISYLVKKGT 

ATESSRE1PMFTLPRRNMESIGLGMARTGGMVVI 

TVLLSVAMFLLVLGFIIALALGSRK 

3797 

A 

1 

1556 

ATRLLRGSGSWGCSRLRFGPPAYRRFSSGGAYPN 

IPLSSPLPGVPKPVFATVDGQEKFETKVTTLDNGL 

RVASQNKFGQFCTVG1LINSGSRYEAKYLSGIAH 

FLEKLAFSSTARFDSKDEILLTLEKHGGICDCQTS 

RDTTMYAVSADSKGLDTVVALLADVVLQPRLT 

DEEVEMTRMAVQFELEDLNLRPDPEPLLTEMIHE 

AAYRENTVGLHRFCPTENVAKINREVLHSYLRN 

YYTPDRMVLAGVGVEHEHLVDCARKYLLGVQP 

AWGSAEAVDIDRSVAQYTGGIAKLERDMSNVSL 

GmiPELTfflMVGLESCSFLEEDOTFAVLNMMM 

GGGGSFSAGGPGKGMFSRLYLNVLNRHHW 

ATSYHHSYEDTGLLCIHASADPRQVREMVEIITK 

EFILMGGWDTVELERAKTQLTSMLMMNLESRP 

VIFEDVGRQVLATRSRKLPHELCTLIRNVKPEDV 

KRVASKMLRGKPAVAALGDLTDLPTYEHIQTAL 

SSKDGRLPRTYRLFR 

3798 

A 

73 

759 

KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 
LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 

3799 

A 

73 

759 

KRLVEAGVPRTFDGIVGEGGAQSRSCWPWGVTA 
QTPAFSADSLNCLKNCMSITMGSVRPSVEQFHKY 
SEQ ID 
NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





LPWFLNDRPNIKCPKGGLAAYSTSVNLTSDGQV 

LASRFMAYHKPLKNSQDYTEALRAARELAANIT 

ADLRKVPGTDPAFEVFPYTITNVFYEQYLTILPEG 

LFMLSLCLVPTFAVSCLLLGLDLRSGLLNLLSIV 

MILVDTVGFMALWGISYNAVSLINLVS 

3800 

A 

250 

1032 

GIFRSLRVLFPLFSVGRPQFARSLSAAPQLSDTAD 
TMGFGDLKSPAGLQVLNDYLADKSYIEGYVPSQ 

A D V A VFF A V ^SPPP A THAT R WVNW IK < 5VFk r F 

KASLPGVKKALGKYGPADVEDTTGSGATDSKD 
DDDIDLFGSDDEEESEEAKRLREERLAQYESKKA 
KKPALVAKSSILLDVKPWDDETDMAKLEECVRS 
IQADGLVWGSSKLVPVGYGIKKLQIQCVVEDDK 
VGTDMT FFOlTAFFTWVO'sMTWA AFNK1 

3801 

A 

155 

656 

SREMELVTFRDVAEEFSPEEWKCLDPAQQNLYR 

DVMLENYRNLVSLGFVISNPDLVTCLEQIKEPCN 

LKIHETAAKPPAICSPFSQDLSPVQGIEDSFHKL1L 

KRYEKCGHENLQLRKGCKRVNECKVQKGVNNG 

VYQCLSTTQSKIFQCNTCVRVFSTSSHSNKHK 

3802 

A 

1 

1428 

VTVSPETHMDLTKGCVTFEDIAIYFSQDEWGLLD 

EAQRLLYLEVMLENFALVASLGCGHGTEDEETP 

SDQNVSVGVSQSKAGSSTQKTQSCEMCVPVLKD 

ILHLADLPGQKPYLVGECTNHHQHQKHHSAKKS 

LKRDMDRASYVKCCLFCMSLKPFRJKWEVGKDL 

PAMLRLLRSLVFPGGKKPGTITECGED1RSQKSH 

YKSGECGKASRHKHTPVYHPRVYTGKKLYECSK 

CGKAFRGKYSLVQHQRVHTGERPWECNECGKF 

FSQTSHLNDHRRIHTGERPYECSECGKLFRQNSS 

L v JJJrlv^Jvlrl 1 vj/\tvr I eA^ov^^vji Ivor o v^IV/\ I JL V tviiv^j 

RVHTGERPYKCGECGNSFSQSAILNQHRR1HTGA 
KPYECGQCGKSFSQKATLIKHQRVHTGERPYKC 
GDCGKSFSQSSILIQHRRIHTGARPYECGQCGKSF 
SQKSGLIQHQVVHTGERPYECNKCGNSFSQCSSL 

3803 

A 

193 

617 

LFPFLGSESKNGEADSSDKEMKHGQKSPTGKQTS 
QHLKRLKKSGLGHLKWTKAEDIDIETPGSJLVNT 
NLRAL1NKHTFASI POHFOOY1 I 11 T PEVDROMG 
SDGILRLSTSALNNEFFAYAAQGWKQRLAEGKF 
VFSIIM 

3804 

A 

197 

479 

SSSRASPPEHPSSQAHCGPLVLSHACPEVTNKWS 
TGSSSSPNSSWVSSPLOPEGLSGSSRMKGGSATKI 
LLETLLLAAHMTADQGIASSQRCLL 

3805 

A 

1 

385 

QSADTLFPGD1NFNVSGLFSAVTLQDTVSDRLAS 
EELPSTAVPTPATTPAPAPAPAPATAPALVSAAT 
KERTESEVPPlxPASPKVTRSPPETAAPVEDMARR 
SELAVGGEEGTEGGRGEGTGSPMSSY 

3806 

A 

47 

1033 

LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNLLQ 

PQAPGHDMTSIPFPGDRLLQVDGVILCGLTHKQA 

VQCLKGPGQVARLVLERRVPRSTQQCPSANDSM 

GDERTAVSLVTALPGRPSSCVSVTDGPKF*SS1SI* 

K1UANGLGFSFVOMEKESCSHLKSDLVRIKRLFP 

GHPAEENGAIAAGDIILGREWEGPRKASSSRCRG 

SWAMQLSVQAGPSFASYYPAAVEVLHLLRGAPQ 

EVTLLLCRPPPGALPELEQEWQTPELSADKEFTR 

ATCTDSCTSPILGSRGQLGGTVPPQMQGKAWGL 

RPESSQKA1REGTMGAKTERDLGPVP 
SEQ ID 
NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 

3807 

A 

OjO 

IZJo 

AVLWGSERTPPYR*GN*NQRGAVPCLRPHRLRP 

QDKFLVLASDGLWDMLSNEDVVRLVVGHLAEA 

DWHKTDLAQRPANLGLMQSLLLQRKASGLHEA 

DQNAATRLIRHAIGNNEYGEMEAERLAAMLTLP 

EDLARMYRDDITVTVVYFNSESIGAYYKGG 

3808 

A 

26 

2195 

SQYSESVAGRQASPERLLGSYHAMASTVEGGDT 

ALLPEFPRGPLDAYRARASFSWKELALFTEGEG 

MLRFKKTIFSALENDPLFARSPGADLSLEKYREL 

NFLRCKRIFEYDFLSVEDMFKSPLKVPALIQCLG 

MYDSSLAAKYLLHSLVFGSAVYSSGSERHLTYIQ 

K1FRMEIFGCFALTELSHGSNTKAIRTTAHYDPAT 

EEFIIHSPDFEAAKFWVGNMGKTATHAVVFAKL 

CVPGDQCHGLHPFIVQIRDPKTLLPMPGVMVGDI 

GKKLGQNGLDNGFAMFHKVRVPRQSLLNRMGD 

VTPEGTYVSPFKDVRQRFGASLGSLSSGRVSIVSL 

AILNLKLAVAIALRFSATRRQFGPTEEEEIPVLEY 

PMQQWRLLPYLAAVYALDHFSKSLFLDLVELQR 

GLASGDRSARQAELGREIHALASASKPLASWTT 

QQGIQECREACGGHGYLAMNRLGVLRDDNDPN 

CTYEGDNNILLQQTSNYLLGLLAHQVHDGACFR 

SPLKSVDFLDAYPGILDQKFEVSSVADCLDSAVA 

LAA Y R W L V C Y LLKh 1 Y QKLN QhKKbCjSoDrEAK 

NKCQVSHGRPLALAFVELTVVQRFHEHVHQPSV 

PPSLRAVLGRLSALYALWSLSRHAALLYRGGYF 

SGEQAGEVLESAVLALCSQLKDDAVALVDV1AP 

PDFVLDSPIGRADGELYKNLWGAVLQESKVLER 

ASWWPEFSVNKPVIGSLKSKL 

3809 

A 

117 

830 

CFGIMERVGCTLTTTYAHPRPTPTNFLPAISTMAS 
SYRDRFPHSNLTHSLSLPWRPSTYYKVASNSPSV 

YRSNLTNYQESNTSRHNSEKLRVDTSRLIQDKYQ 

QTRKTQADTTQNLGERVNDIGFWKSEIIHELDEM 

IGETNALTDVKKRLERALMETEAPLQVARECLF 

l^KRMGlDLVHDEVEAQLLTVNVGEMHQSQA 

A 


A 

j 

315 

HPATRRPASGPAMGKTNSKLAPEVLEDLVQNTE 
FSEQELKQWYKGFLKDCPSGILNLEEFQQLYIKF 
FPYGDASKFAQHAFRTFDKNGDGTIDFREFICAL 
SVTSRGSFEQKLNWAFEiVrYDLDGDGRJTRLEML 
EIIE 

3811 

A 

81 

1147 

GCGYGCSGAGGAAIGEPMAKWGEGDPRWIVEE 
RADATmn^WHWTERDASNWSTDKLKTLFLAV 
QVQlSOEEGKCEVTEVSKLDGEASnsTNRKGKLIFFY 
EWSVKLNWTGTSKSGVQYKGHVEPNLSDENSV 
DEVEISVSLAKDEPDTNLVALMKEEGVKLLREA 
MCrl Y lo 1 Lis. 1 rlr 1 l^CrMlL/r 1 MlNuti Vurvu^rAL 
KTEERKAKPAPSKTQAJRPVGVKIPTCKITLKETFL 
TSPEELYRVFTTOFLVOAFTHAPATI EADRGGKF 

X Ul l-vt./l-< J. 1\ V 1 1 1 \ V Will 1 LLi\.X A 1 X-ii— *xTLJL/AWJ vJJLvJ. 

rCVlVDGNVSGEl^LVPEKJilVMKWRFKSWPEG 
HFATITLTFIDKNGETELCMEGRGIPAPEEERTRQ 
GWQRYYFEGIKQTFGYGARLF 

3812 

A 

20 

558 

PCGTAASTl^YDRRAKCRQQQQQQQNGGQNKV 
l^AKXKTSPAl^VSSESGTSGQl^PSSTSVPTIAS 
SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





SSAPVSIWSPASISPLSDPLSTSSSCMQRSYPMTYT 
QASGYSQGYAGSTSYFGGMDCGSYLTPMHHQL 
PGPGATLSPMGTNAVTSHLNQSPASLSTQGYGAS 
KLWGFNFNH 

3813 

A 

1 

1016 

CTEPPRRSTRTPAALASLRPYTDYVVVSDQILQES 

EDFFTL1ESHEGKPLKLMVYNSKSDSCREVTVTP 

NAAWGGEGSLGCGIGYGYLHRIPTQPPSYHKKPR 

GTPPPSALPLGAPPPDALPPGPTPEDSPSLETGSRQ 

SDYMEALLQAPGSSMEDPLPGPGSPSHSAPDPDG 

LPHFMETPLQPPPPVQRVMDPGFLDVSGISLLDN 

SNASVWPSLPSSTELTTTAVSTSGPEDICSSSSSHE 

RGGEATWSGSEFEVSFLDSPGAQAQADHLPQLT 

LPDSLTSAASPEDGLSAELLEAQAEEEPASTEGLD 

TGTEAEGLDSQAQISTTE*HPGL*QGP 

3814 

A 

2 

884 

VFWQVRNAGSSPLSAACPLFRTPAPQPCGSWGR 

CC1PHASTGCRPMAERGELDLTGAKQNTGVWLV 

KVPKYLSQQWAKASGRGEVGKLRIAKTQGRTE 

VSFTLNEDLANIHDIGGKPASVSAPREHPFVLQSV 

GGQTLTVFTESSSDKLSLEGIVVQRAECRPAASE 

NYMRLKRLQIEESSKPVRLSQQLDKVVTTNYKP 

VANHQYNIEYERKKKEDGKRARADKQHVLDML 

FSAFEKHQYYNLKDLVDITKQPVVYLKEILKEIG 

VQNVKGIHKNTWELKPEYRHYQGEEKSD 

3815 

A 

17 

411 

NIGDWEDIGKSPERJ1QYYGPATWAQDGSRGYCT 
PIYMLNHIIRLQAVLEIIMNERANALDLLAQQTTK 
MRNANYQNRLALDYLLAHEGGV*GKFSLTNCC 
LEIDDNGKAIMEITARMRKLAHIPVQTWER 

3816 

A 

3 

1172 

SHWQRRDRRCVRNMAERGRKRPCGPGEHGQRI 

EWRKWKQQKKEEKKKWKDLKLMKKLERQRAQ 

EEQAKRLEEEEAAAEKEDRGRPYTLiSVALPGSIL 

DNAQSPELRTYLAGQIARACAIFCVDEIVVFDEE 

GQDAKTVEGEFTGVGKKGQACVQLARILQYLEC 

PQYLRKAFFPKHQDLQFAGLLNPLDSPHHMRQD 

EESEFREGVVVDRPTRPGHGSFVNCGMKKEVKI 

DKNLEPGLRVTVRLNQQQHPDCKTYHGKWSS 

QDPRTKAGLYWGYTVRLASCLSAVFAEAPFQDG 

YDLTIGTSERGSDVASAQLPNFRHALVVFGGLQG 

LEAGADADPNLEVAEPSVLFDLYVNTCPGQGSR 

T1RTCEAILISLAALQPGLIQAGARHT 

3817 

A 

246 

1197 

FLSAGMSNFTHYAYLLMIESLMLGKVPPHVPSH 

HFIFHDDGSARQKGESDYKVIIQQWFSKSGPWTT 

SSNVTWGLLELQQSISESAVLTIPPGDSGAGSNLI 

TMFLRNRKETDLCSGRSKVNRGWNSGRCKQRG 

KTEQPGEPLEHVYVTIKHAVALESRHQKGELQC 

LIKMCIPLSKPLQMFFSPPHWEAWLQRVQQLAK 

NTTIYFRQRLQEMGFIIYGNENASVVPLLLYMPG 

KVAAFARHMLEKKIGVVVVGFPATPLAEARARF 

CVSAAHTREMLDTVLEALDEMGDLLQLKYSRH 

KKSARPELYDETSFELED 

3818 

A 


/ay 

\ior\c pp CTr/^cccrCrf^w/TvT/^trxroT t "\ r/~\n o t?x /hp/~\ a n/^* 
Nr^ooobbuobblrl^VNOHNKJ^LVQRSbV 

QYTVDVEGHGCTF1QATLKYNVLLPKKASGFSLS 

LEIVKNYSSTAFDLTVTLKYTGIRNKSSMVVIDV 

KMLSGFTPTMSSIEELENKGQVMKTEVKNDHVL 

FYLENVFGRADSFTFSVEQSNLVFNIQPAPGMVY 

DYYEKEEYALAFYHINSSSVSE 
SEQ ID 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 

3819 

A 

1 

1483 

RIPDSIISRGVQGLPRDTASLSTTPSESPRAQATSR 

LSTASCPTPKVQSRCSSKENILRASHSAVDITKVA 

RJRHRMSPFPLTSMDKAFITVLEMTPVLGTEirNYR 

DGMGRVLAQDVYAKDNLPPFPASVKDGYAVRA 

ADGPGDRFIIGESQAGEQPTQTVMPGQVMRVTT 

GAPIPCGADAVVQVEDTELIRESDDGTEELEVRIL 

VQARPGQDIRPIGHDIKRGECVLAKGTHMGPSEI 

GLLATVGVTEVEVNKFPVVAVMSTGNELLNPED 

DLLPGIGRDSNRSTLLATIQEHGYPTINLGIVGDN 

PDDLLNALNEG1SRADVI1TSGGVSMGEKDYLKQ 

VLDIDLHAQIHFGRVFMKPGLPTTFATLDBDGVR 

KIIFALPGNPVSAVVTCNLFVVPALRKMQGILDP 

RPTIIKARLSCDVKLDPRPEYHRCILTWHHQEPLP 

WAQSTGNQMSSRLMSMRSANGLLMLPPKTEQY 

VELHKGEVVDVMVIGRL 

3820 

A 

2216 

487 

PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAG1LCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGVVYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FFTTFAL 

3821 

A 

2216 

487 

PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCK 

QVCSTVGGSAICSCFPGYAIMADGVSCEDQDECL 

MGAHDCSRRQFCVNTLGSFYCVNHTVLCADGYI 

LNAHRKCVDINECVTDLHTCSRGEHCVNTLGSF 

HCYKALTCEPGYALKDGECEDVDECAMGTHTC 

QPGFLCQNTKGSFYCQARQRCMDGFLQDPEGNC 

VDINECTSLSEPCRPGFSCINTVGSYTCQRNPLIC 

ARGYHASDDGTKCVDVNECETGVHRCGEGQVC 

HNLPGSYRCDCKAGFQRDAFGRGCIDVNECWAS 

PGRLCQHTCENTLGSYRCSCASGFLLAADGKRC 

EDVNECEAQRCSQECANIYGSYQCYCRQGYQLA 

EDGHTCTDIDECAQGAGELCTFRCLNVPGSYQCA 

CPEQGYTMTANGRSCKDVDECALGTHNCSEAET 

CHNIQGSFRCLRFECPPNYVQVSKTKCERTTCHD 

FLECQNSPARITHYQLNFQTGLLVPAHIFRIGPAP 

AFTGDTIALNIIKGNEEGYFGTRRLNAYTGWYL 

QRAVLEPRDFALDVEMKLWRQGSVTTFLAKMHI 

FFTTFAL 

3822 

A 

2502 

1540 

MAAATRGCRPWGSLLGLLGLVSAAAAAWDLAS 
LRCTLGAFCECDFRPDLPGLECDLAQHLAGQHL 
AKALVVKALKAFVRDPAPTKPLVLSLHGWTGTG 
KSYVSSLLAHYLFQGGLRSPRVHHFSPVLHFPHP 




WO 01/57190 


PCT/US01/04098 


SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





SHTFR YKKDT k^WVOfTNII TArr,R^ PI prncA/mf 
onicrv i ]\JSj^i_,rvo w v v^vjin l, i /A.v^orvoL.r Lr IJiiiYliJrv 

MPPGLMEVLRPFLGSSWVVYGTNYRKAIFfFISN 

TGGEQIMQVALEAWRSRRDREEILLQELEPVISR 

AVLDWHHGFSNSGIMEERJLLDAVVPFLPLQRHH 

VRHCVLNELAQLGLEPRDEVVQAVLDSTTFFPE 

DEQLFSSNGCKTVASRJAFFL 

3823 

A 

1 

3174 

YGCEKTTEGRIPLKNIYRLFSADRKRVETALEAC 

SLPSSRNDSIPQEDFTPEVYRVFLNNLCPRPEIDN1 

FSEFGAKSKPYLTVDQMMDFJNLKQRDPRLNEIL 

YPPLKQEQVQVL1EKYEPNNSLARKGQISVDGFM 

RYLSGEENGVVSPEKLDLNEDMSQPLSHYFINSS 

HNTYLTAGQLAGNSSVEMYRQVLLSGCRCVELD 

CWKGRTAEEEPVITHGFTMTTEISFKEVIEAIAEC 

AFKTSPFPILLSFENHVDSPKQQAKMAEYCRLIFG 

DALLMEPLEKYPLESGVPLPSPMDLMYK1LVKN 

KJCKSHKSSEGSGKKKLSEQASNTYSDSSSMFEPS 

SPGAGEADTESDDDDDDDDCKKSSMDEGTAGSE 

AMATEEMSNLVNY1QPVKFESFEISKKRNKSFEM 

SSFVETKGLEQLTKSPVEFVEYNKMQLSRIYPKG 

TRVDSSNYMPQLFWNAGCQMVALNFQTMDLA 

MQINMGMYEYNGKSGYRLKPEFMRRPDKHFDP 

FTEGIVDGIVANTLSVKI1SGQFLSDKKVGTYVEV 

DMFGLPVDTRRKAFKTKTSQGNAVNPVWEEEPI 

VFKKVVLPTLACLRJAVYEEGGKFIGHRJLPVQAI 

RPGYHYICLRNERNQPLTLPAVFVY1EVKDYVPD 

TYADVIEALSNPIRYVNLMEQRAKQLAALTLEDE 

EEVKKEADPGETPSEAPSEARTTPAENGVNHTTT 

LTPKPPSQALHSQPAPGSVKAPAKTEDL1QSVLTE 

VEAQTIEELKQQKSFVKLQKKHYKEMKDLVKR 

HHKKTTDLIKEHTTKYNEIQNDYLRRRAALEKS 

AKKDSKKKSEPSSPDHGSSTIEQDLAALDAEMTQ 

KLE)LKDKQQQQLLNLRQEQYYSEKYQKREHIK 

LiL>1 \£ JSJL 1UV Anr^l^N IN v^Li<d\J^jV±ll^ 

KMDKKRQEKITEAKSKDKSQMEEEKTEMIRSYI 

QEWQYIKRLEEAQSKRQEKLVEKHKEIRQQILD 

EBCPKLQVELEQEYQDKFKJRLPLEILEFVQEAMKG 

KISEDSNHGSAPLSLSSDPGKVNHKTPSSEELGGD 

TPGKFFDTPI 

3824 

A 

1 

426 

ILHWFVHRWSGRNNREKIGVHVGFEEILNMEPY 
CCRETLKSLRPECF1YDLSAVVMHHGKGFGSGH 
YTAYCYNSEGGFWVHCNDSKLSMCTMDEVCKA 
OAYTT FYTORVTFNOl-r^KI I PPFT T T n^OHPTsTFn 

ADTSSNEILS 

3825 

A 

3 

364 

GIRAKFPNKIPVVVERYPRETFLPPLDKTKFLVPQ 
ELTMTQFLSIIRSRMVLRATEAFYLLVKNKSLVS 
MSATMAEIYRDYKDEDGFVYMTYASQETFGCLE 
SAAPRDGSSLEDRPLHPL 

3826 

A 

1 

1237 

PEKKFERECREAEKAQQSYERLDNDTNATKADV 

EKAKQQLNLRTHMADENKNEYAAQLQNFNGEQ 

HKHFYVVIPQIYKQLQEMDERRTIKLSECYRGFA 

DSERKVIPIISKCLEGMILAAKSVDERRDSQMVV 

DSFKSGFEPPGDFPFEDYSQHIYRTISDGTISASKQ 

ESGKMDAKLTTVGKAKGKLWLFGKKPKGPALED 

FSHLPPEQRRKKLQQRIDELNRELQKESDQKDAL 

NKMKJDVYEK^QMGDPGSLQPKIAET^^ 
SEQ ID 
NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





LRMEIHKNEAWLSEVEGKTGGRGDRRHSSDINH 
LVTQGRESPEGSYTDDANQEVRGPPQQHGHHNE 
FDDEFEDDDPLPAIGHCKAIYPFDGHNEGTLAMK 
EGEVLYIIEEDKGDGWTRARRQNGEEGYVPTSYI 
DVTLEKNSKGS 

3827 

A 

2 

1584 

INPVSSAVNGEAHSSHETRGQNSNALPSVLLELL 

SQSCLIPAMSSYLRNDSVLDMARHVPLYRALLEL 

LRAIASCAAMVPLLLPLSTENGEEEEEQSECQTS 

VGTLLAKMKTCVDTYTNRLRSKRENVKTGVKP 

DASDQEPEGLTLLVPDIQKTAE1VYAATTSLRQA 

NQEKKLGEYSKKAAMKPKPLSVLKSLEEKYVAV 

MKKLQFDTFEMVSEDEDGKLGFKVNYHYMSQV 

KNANDANSAARARRLAQEAVTLSTSLPLSSSSSV 

FVRCDEERLDIMKVLITGPADTPYANGCFEFDVY 

FPQDYPSSPPLVNLETTGGHSVRFNPNLYNDGKV 

CLS1LN 1 WHUKrbbKWNrl^ 1 borLQVLVb VQ^Ll 

LVAEPYFNEPGYERSRGTPSGTQSSREYDGNIRQ 

ATVKWAMLEQIRNPSPCFKEViHKHFYLKRVEIM 

AQCEEW1ADIQQYSSDKRVGRTMSHHAAALKRH 

TAQLREELLKLPCPEGLDPDTDDAPEVCRATTGA 

EETLMHDQVKPSSSKELPSDFQL 

3828 

A 

1415 

845 

PRVPATLVSLDPWHCFPTAGRLAGSTWVPPACT 

LQLGPSSEHELDNHRAPLLSLPSQESLSFTPWYLV 

ACKPLFHIFCPLFACFMQEGKVQYLFLHLSHMRL 

LNYYFFPFLAPESLMQALEDLDYLAALDNDGNL 

SEFGIIMSEFPLDPQLSKSILASCEFDCVDEVLTIA 

AMV 1 OJJ^INJJ YororrAINLrl 

3829 

A 

199 

683 

VDHTPVLSKPQCFSSVKWGATLSARSQKTSGIGR 
LMVHVIEATELKACKPNGKSNPYCEISMGSQSYT 
TRTIQDTLNPKWNFNCQFFIKDLYQDVLCLTLFD 
RDQFSPDDFLGRTEIPVAK1RTEQESKGPMTRRLL 
LHEVPTGEVWVRFDLQLFEQKTLL 

3830 

A 

1747 

404 

RKMMEESGfETTPPGTPPPNPAGLAATAMSSTPV 
PLAATSSFSSPNVSSMESFPPLAYSTPQPPLPPVRP 
SAPLPFVPPPAVPSVPPLVTSMPPPVSPSTAAAFG 
NPPVSHFPPSTSAPNTLLPAPPSGPPISGFSVGSTY 
DITRGHAGRAPQTPLMPSFSAPSGTGLLPTPITQQ 
ASLTSLAQGTGTTSAITFPEEQEDPRJTRGQDEAS 
AGGIWGFIKGVAGNPMVKSVLDKTKHSVESMIT 
TLDPGMAPYIKSGGELDIVVTSNKEVKVAAVRD 

a it/*m; a /i?m a \r\r\iriT* A n/^cxiT A DHD\/nV A A Cl\ vri 
AryJlVruLAVV VOEAUl^INlAryr VU 1 AAOJUKo 

AQERIDSLRRTGVIHEKQTAVSVENFIAELLPDK 

WFDIGCLVVEDPVHGIHLETFTQATPVPLEFVQQ 

AQSLTPQDYNLRWSGLLVTVGEVLEKSLLNVSR 

TOWHMAFTGMSRRQMIYSAARAIAGMYKQRLP 

PRTV 

3831 

A 

5 

674 

FWTRSAWHEGLQQMKANDPSLQEVNLYNIKNIP 

TUT I O CI? AT/" AT T?TXT'T r LJ\/T/ r T/" 17 OT A A TD CXTHOX/ A T A C 

IP 1 LKJbr AKALb 1 N 1 H VJsJvrbLAA 1 KbNDr VAIAr 
ADMLKVNTTLTSLNIESHFITGTGILALVEALKEN 
htt TPiirrnMnRnni c*ta vfmfiaoa/iT ffn^rtt 

U 1 JU 1 X-.lXVl.L'lN V^I\.v^V^l-»VJ 1 r\ V J^1V1J31/\V^1V1JjCC»1NOJV1Ju 

KFGYQFTKQGPRTRVAAAITKNNDLAWQKDTQ 
EQTSIWQVVSQSIAGFNPQFEVQGQNARSWMEE 
LGKAFHQFVRRELKQTEGKLP 

3832 

A 

164 

782 

EPWVPMDVAESPERDPHSPEDEEQPQGLSDDDIL 
RDSGSDQDLDGAGVRASDLEDEESAARGPSQEE 
SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





EDNHSDEEDRASEPKSQDQDSEVNELSRGPTSSP 

CEEEGDEGEEDRTSDLRDEASSVTRELDEHELDY 

DEEVPEEPAPAVQEDEAEKAGAEDDEEKGEGTP 

REEGKAGVQSVGEKESLEAAKEKKKEDDDGEID 

DEEMY 

3833 

A 

122 

1676 

SQPPHFTQKMNENKDTDSKKSEEYEDDFEKDLE 

WLINENEKSDASIIEMACEKEENINQDLKENETV 

MEHTKJRHSDPDKSLQDEVSPRRNDIISVPGIQPLD 

PISDSDSENSFQESKLESQKDLEEEEDEEVRRY1M 

EKIVQANKLLQNQEPVNDKRERKLKFKDQLVDL 

EVPPLEDTTTSKNYFENERNMFGKLSQLCISNDF 

GQEDVLLSLTNGSCEENKDRTILVERDGKFELLN 

LQDIASQGFLPPINNANSTENDPQQLLPRSSNSSV 

SGTKKEDSTAKIHAVTHSSTGEPLAYIAQPPLNR 

KTCPSSAVNSDRSKGNGKSNHRTQSAHISPVTST 

YCLSPRQKELQKQLEEKREKLKREEERRK1EEEK 

EKKRENDIVFKAWLQKKREQVLEMRRIQRAKEI 

EDmNSRQENRDPQQAFRLWLKJCKHEEQMKER 

TEELRKQbbCLrFLrvO 1 bORbKAr* WLRKJvKjvI 

EKMAEQQAVRERTRQLRLEAKRSKQLQHHLYM 

SEAKPFRFTDHYN 

3834 

A 

575 

774 

RSRTEELSNSG1LKAMSKDLVTFGDVAVNFSQEE 
WEWLNPAQRNLYRKVMLENYRSLVSLGKDMSP 

3835 

A 

2 

100 

ASDFYLRYYVGHKGKFGHEFLEFEFRPDGVYV 

3836 

A 

91 

749 

RPTPGHGDFWMQPLTKDAGMSLSSVTLASALQV 

RGEALSEEEIWSLLFLAAEQLLEDLRNDSSDYVV 

CPWSALLSAAGSLSFQGRVSHIEAAPFKAPELLQ 

GQSEDEQPDASQMHVYSLGMTLYWSAGFHVPP 

HQPLQLCEPLHSILLTMCEDQPHRRCTLQSVLEA 

CRVHEKEVSVYPAPAGLHTRRLVGLVLGTISEVS 

REPCFSSSSCWSCVAIKJ 

3837 

A 

3 

1214 

SLGCTNSARGKGQDDEVRTLMANGAPFTTDWFS 

KLRVSCGYIGDNCKNGADVNAKDMLKMTALH 

WATERHHRDVVELL1KYGADVHAFSKFDKSAFD 

IALEKNNAEILVILQEAMQNQVNVNPERANPVTD 

PVSMAAPFIFTSGEVVNLASLISSTNTKTTSGDPH 

ASTVQFSNSTTSVLATLAALAEASVPLSNSHRAT 

ANTEEIlbONSVDaSiQQ VMGSOOQRV1T1 V I DO V 

PLGNIQTSIPTGGIGHPFIVTVQDGQQVLTVPAGK 

VAEETVIKEEEEEKLPLTKKPRIGEKTNSVEESKE 

GNERELLQQQLQEANRRAQEYRHQLLKKEQEAE 

QYRLKLEA1ARQQPNGVDFTMVEEVAEVDAVV 

VTEGELEERETKVTGSAGATGPPTRVSMATVSS 

3838 

A 

1 

1332 

MIEDNKENKDHSLERGRASLIFSLKNEVGGLIKA 

LKIFQEKHVNLLHIESRKSKRRNSEFEIFVDCDIN 

REQLNDIFHLLKSHTNVLSVNLPDNFTLKEDGME 

TVPWFPKKISDLDHCANRVLMYGSELDADHPGF 

KDNVYRKRRKYFADLAMNYKHGDPIPKVEFTEE 

EIKTWGTWQELNKLYPTHACREYLKNLPLLSKY 

CO Yi\JlUJNJJrC^LbLi VoJNr LivbK 1 Or MrCr V AO Y Lor 

RDFLSGLAFRVFHCTQYVRHSSDPFYTPEPDTCH 

ELLGHVPLLAEPSFAQFSQEIGLASLGASEEAVQ 

KLATCYFFTVEFGLCKQDGQLRVFGAGLLSSISE 

LBCHALSGHAKVKPFDPKITCKQECLITTFQDVYF 

VSESFEDAKEKMREFTKTIKRPFGVKYNPYTRSI 
SEQ ID 
NO: 

Method 

Predicted 
beginning 
nucleotide 
location 

corresponding 

to first amino 
acid residue of 
peptide 
sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





QILKDTKSITSAMNELQHDLDVVSDALAKVSRKP 
SI 

3839 

A 

3093 

520 

MVNFTVDQIRAINCDKKANIRNMSVIAHVDHGKS 

TLTDSLVCKAGIIASARAGETRFTDTRKDEQERCI 

TIKSTAISLFYELSENDLNFIKQSKDGAGFLINLID 

SPGHVDFSSEVTAALRVTDGALVVVDCVSGVCV 

QTETVLRQAIAERIKPVLMMNKMDRALLELQLE 

PEELYQTFQRIVENVNVIISTYGEGESGPMGNIMI 

DPVLGTVGFGSGLHGWAFTLKQFAEMYVAKFA 

AKGEGQLGPAERAKKVEDMMKKLWGDRYFDP 

ANGKFSKSATSPEGKKLPRTFCQLILDPIFKVFDA 

IMNFKKEETAKLIEKLDIKLDSEDKDKEGKPLLK 

AVMRRWLPAGDALLQMITIHLPSPVTAQKYRCE 

LLYEGPPDDEAAMGIKSCDPKGPLMMYISKMVP 

TSDKGRFYAFGRVFSGLVSTGLKVRJMGPNYTPG 

KKEDLYLKPIQRTILMMGRYVEPIEDVPCGNIVG 

LVGVDQFLVKTGTITTFEHAHNMRVMKFSVSPV 

VRVAVEAKNPADLPKLVEGLKRLAKSDPMVQCI 

IEESGEHIIAGAGELHLEICLKDLEEDHACIPIKKS 

DPVVSYRETVSEESNVLCLSKSPNKFINRLYMKA 

RPFPDGLAEDIDKGEVSARQELKQRARYLAEKY 

EWDVAEARKIWCFGPDGTGPNILTDITKGVQYL 

INJdIiSJJo V Vnur^WA I J\JiU/\J^VvCGlNtvll\Aj V Ivr u v 

HDVTLHADAIHRGGGQIIPTARRCLYASVLTAQP 
RLMEPIYLVEIQCPEQVVGGIYGVLNRKRGHVFE 
ESQVAGTPMFVVKAYLPVNESFGFTADLRSNTG 
GQAFPQCVFDHWQILPGDPFDNSSRPSQVVAETR 
KRKGLKEGIPALDNFLDKL 

3840 

A 

2 

753 

SSTRSRDFCCSEAIQGSLTRRERRASGVRTRRSQG 
SSAMASKILLNVQEEVTCPICLELLTEPLSLDCGH 

oLl/KALl 1 V oIN rvi_»/\ V 1 ^IvivjvJlVoO^Jr v i_ajio i or ij 

HLQANQHLANIVERLKEVKLSPDNGKKRDLCDH 
HGEKLLLFCKEDRKVICWLCERSQEHRGHHTVL 
TEEVFKECQEKLQAVLKRLKKEEEEAEKLEADIR 
EEKTSWKYQVQTERQRIQTEFDQLRSILNNEEQR 
Fl ORT FFFFKKT 

3841 

A 

2 

405 

GKAFSCFTYLSQFIRRTHMAEKPYECKTCKKAFS 
HFGNLKVHERIHTGEKPYECKECRKAFSWLTCL 
I RHFRIHTfiKKSYFCOOCGKAFTRSRFLRGHEKT 
HTGEKMHECKECGKALSSLSSLHRHKRTHWRDT 
L 

3842 

A 

311 

88 

AVLKNIvL\PMTALGLLDLHILNLILFLSAGEDFTS 
VVSEIMMYILLVFLTLWLLIEMIYCYRKVSKAEE 
AAQENA 

3843 

A 

3 

1175 

APIRNSRIDDFVRRVESKATSARCGLWGSGPRRR 

PASGMFRGLSSWLGLQQPVAGGGQPNGDAPPEQ 

PSETVAESAEEELQQAGDQELLHQAKDFGNYLF 

NFASAATKKITESVAETAQTIKXSVEEGKJDGIID 

KTIIGDFQKEQKKFVEEQHTKKSEAAVPPWVDT 

IWEETIQQQILALSADKilOTLRDPPAGVQFNFDF 

DQMYPVALV3VILQEDELLSKMRFALVPKLVKEE 

VFWRNYFYRVSLIKQSAQLTALAAQQQAAGKEE 

KSNGREQDLPLAEAVRPKTPPVVIKSQLKTQEDE 

EEISTSPGVSEFVSDAFDACNLNQEDLRKEMEQL 

VLDKKQEETAVLEEDSADWEKELQQELQEYEV 
SEQ ID 

NO: 

Method 

Predicted 

beginning 

nucleotide 

location 

corresponding 

to first amino 

acid residue of 

peptide 

sequence 

Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue of 
peptide 
sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, 
E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, 
N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, 
X=Unknown, *=Stop codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion 





VTESEKRDENWDKE1EKMLQEEN 

3844 

A 

798 

148 

LPPAQIPEAWLLLANVVVVLILVPLKDRLIDPLLL 

RCKLLPSALQKMALGMFFGFTSVIVAGVLEMER 

LHYIHHNETVSQQIGEVLYNAAPLSIWWQIPQYL 

LIGISEIFASIPGLEFAYSEAPRSMQGAIMGIFFCLS 

GVGSLLGSSLVALLSLPGGWLHCPKDFGNINNCR 

MDLYFFLLAGIQAVTALLFVWIAGRYERASQGP 

ASHSRFSRDRG 

3845 

A 

3 

1934 

PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHIS^rTl^MLQRXRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG 

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP 

VNRPHLKKKMRVWRDSFLVERGKLLHKRDNDK 

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ 

AWKDHKLHIDHEIETLQNKIKNLREVRGHLKKK 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

WITYWCMRTINETHNFLFCEFATGFLEYFDLNT

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 

3846 

A 

3 

1934 

PEDSAPQYSRLFPNASQHITPSYNYAPNPDKHWI 

MRYTGPMKPIHMEFTOMLQRKJRLQTLMSVDDS 

METIYNMLVETGELDNTYIVYTADHGYHIGQFG

LVKGKSMPYEFDIRVPFYVRGPNVEAGCLNPHIV 

LNIDLAPTILDIAGLDIPADMDGKSILKLLDTERP 

VNRFHLKKKMRVWRDSFLVERGKLLHKRDNDK

VDAQEENFLPKYQRVKDLCQRAEYQTACEQLG 

QKWQCVEDATGKLKLHKCKGPMRLGGSRALSN 

LVPKYYGQGSEACTCDSGDYKLSLAGRRKKLFK 

KKYKASYVRSRSIRSVAIEVDGRVYHVGLGDAA 

QPRNLTKRHWPGAPEDQDDKDGGDFSGTGGLP 

DYSAANPIKVTHRCYILENDTVQCDLDLYKSLQ

AWKDHKLHIDHEIETLQNKIKNLREVRGHLKKK 

RPEECDCHKISYHTQHKGRLKHRGSSLHPFRKGL 

QEKDKVWLLREQKRKKKLRKLLKRLQNNDTCS

MPGLTCFTHDNQHWQTAPFWTLGPFCACTSAN 

NNTYWCMRTINETHNFLFCEFATGFLEYFDLNT 

DPYQLMNAVNTLDRDVLNQLHVQLMELRSCKG 

YKQCNPRTRNMDLGLKDGGSYEQYRQFQRRKW 

PEMKRPSSKSLGQLWEGWEG 

3847 

A 

1 

1257 

MWSAVLTAFHTGTSNTTFVVYENTYMNITLPPP 
r(^llrlJLorLLKYoriilMAr 10LooL»l VJNo 1 AVr 1 1 
PAAFKSLNLPLQITLSAIMIFILFVSFLGNLVVCLM 
VYQKAAMRSAINILLASLAFADMLLAVLNMPFA 
LVTILTTRWIFGKFFCRVSAMFFWLFVIEGVAILL 
IISIDRFLI1VQRQDKLNPYRAKVLIAVSWATSFCV 
AFPLAVGNPDLQIPSRAPQCVFGYTTNPGYQAYV 

TT TST TSFFTPFI VTT YSFMGT1 NTT RHNAT RIWWPF 

GICLSQASKLGLMGLQRPFQMSIDMGFKTRAFTT 

ILILFAVFIVCWAPFTTYSLVATFSKHFYYQHNFF 

EISTWLLWLCYLKSALNPLIYYWRIKKFHDACLD 

MMPKSFKFLPQLPGHTKRRIRPSAVYVCGEHRT 

VV 

3848 

A 

3 

2827 

SSAVAARRRRSWASLVLAFLGVCLGITLAVDRS 

NFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGP 

DSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDE 

LEPRRPRYRVPDVLVADPPIARLSVSGRDENSVE 

LTMAEGPYKIILTARPFRLDLLEDRSLLLSVNARG 

LLEFEHQRAPRVSQGSKDPAEGDGAQPEETPRD 

GDKPEETQGKAEKDEPGAWEETFKTHSDSKPYG 

PMSVGLDFSLPGMEHVYGIPEHADNLRLKVTEG 

GEPYRLYNLDVFQYELYNPMALYGSVPVLLAHN 

PHRDLGIFWLNAAETWVD1SSNTAGKTLFGKMM 

DYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISD 

VFRQYASLTGTQALPPLFSLGYHQSRWNYRDEA 

DVLEVDQGFDDHNLPCDVIWLDIEHADGKRYFT 

WDPSRFPQPRTMLERLASKRRKLVAIVDPHIKVD

SGYRVHEELRNLGLYVKTRDGSDYEGWCWPGS 

AGYPDFTNPTMRAWWANMFSYDNYEGSAPNLF 

VWNDMNEPSVFNGPEVTMLKDAQHYGGWEHR 

DVHNIYGLYVHMATADGLRQRSGGMERPFVLA 

RAFFAGSQRFGAVWTGDNTAEWDHLKISIPMCL 

SLGLVGLSFCGADVGGFFKNPEPELLVRWYQMG 

AYQPFFRAHAHLDTGRREPWLLPSQHNDIIRDAL 

GQRYSLLPFWYTLLYQAHREGIPVMRPLWVQYP 

QDVTTFNIDDQYLLGDALLVHPVSDSGAHGVQV 

Yl PnOGFVWVTYinwnKHVIOPnTT YT PVT1 <^TP 

VFQRGGTIVPRWMRVRRSSECMKDDPITLFVALS 

PQGTAQGELFLDDGHTFNYQTRQEFLLRRFSFSG 

NTLVSSSADPEGHFETPIWIERVVIIGAGKPAAW 

LQTKGSPESRLSFQHDPETSVLVLRKPGINVASD 

WSIHLR 

3849 

A 

1 

1717 

RARNARGCWGVCRSGFSSAVCGAARMEQVAEG 

ARVTAVPVSAADSTEELAEVEEGVGVVGEDNDA 

AARGAEAFGDSEEDGEDVFEVEKILDMKTEGGK 

VLYKVRWKGYTSDDDTWEPEIHLEDCKEVLLEF 

RKXIAENKAKAVRKDIQRLSLNNDIFEANSDSDQ 

QSETKEDTSPKKKKKKLRQREEKSPDDLKKKKA 

KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK 

EELKESKXPKKDEVKETKELKKVKKGEIRDLKT 

KTREDPKENRKTKKEKFVESQVESESSVLNDSPF 

PEDDSEGLHSDSREEKQNTKSARERAGQDMGLE 

HGFEKPLDSAMSAEEDTDVRGRRKKKTPRKAED 

TRENRKLENKNAFLEKKTVPKKQRNQDRSKSAA 

ELEKLMPVSAQTPKGRRLSGEERGLWSTDSAEE 

DKETKRNESKJCPKKDEVKETKELKKVKKGEIRD 

LKTKTREDPKENRXTKKEKFVESQVESESSVLND

SPFPEDDSEGLHSDSREEKQNTKSARERAGQDM 

GLEHGFEKPLDSAMSAEEDTDVRGRRKKKTPRK 

AEDTRENRKLENKNAFLEKKTVPKKQRNQDRSK 

SAAELEKLMPVSAQTPKGRRLSGEERGLWSTDS 

AEEDKETKJINESKKPKKDEVKETKELKK\^G^ 

IRDLKTKTREDPKENRKTKKEKFVESQVESESSV
LNDSPFPED/RQ*RATFRQQREEKSPDDLKKKKA
KAGKLKDKSKPDLESSLESLVFDLRTKKRISEAK
EELKESKKPK 

3850 

A 

1113 

3975 

PAAAAAAAAAAAAAAGRGPSFTPCFSPSLAVEPS 

RRTRLGSDPAQAMAGNVKKSSGAGGGSGSGGS 

GSGGLIGLMKDAFQPHHHHHHHLSPHPPGTVDK 

KMVEKCWKLMDKVVRLCQNPKLALKNSPPYIL 

DLLPDTYQHLRTILSRYEGKMETLGENEYFRVF 

MENLMKKTKQTISLFKEGKERMYEENSQPRRNL 

TKLSLIFSHMLAELKGIFPSGLFQGDTFRJTKADA 

AEFWRKAFGEKTIVPWKSFRQALHEVHPISSGLE 

AMALKSTIDLTCNDYISVFEFDEFTRLFQPWSSLL 

RNWNSLAVTHPGYMAFLTYDEVK^RLQKFIHKP 

GSYIFRLSCTRLGQWAIGYVTADGNILQTIPHNKP 

LFQALIDGFREGFYLFPDGRNQNPDLTGLCEPTP 

QDHIKVTQEQYELYCEMGSTFQLCKICAENDKD 

VKIEPCGHLMCTSCLTSWQESEGQGCPFCRCEIK 

GTEPIVVDPFDPRGSGSLLRQGAEGAPSPNYDDD 

DDERADDTLFMMKELAGAKVERPPSPFSMAPQA 

SLPPVPPRLDLLPQRVCVPSSASALGTASKAASGS 

LHKDKPLPVPPTLRDLPPPPPPDRPYSVGAESRPQ 

RRPLPCTPGDCPSRDKLPPVPSSRLGDSWLPRPIP 

KVPVSAPSSSDPWTGRELTNRHSLPFSLPSQMEP 

RPDVPRLGSTFSLDTSMSMNSSPLVGPECDHPKI 

KPSSSANAIYSLAARPLPVPKLPPGEQCEGEEDTE 

YMTPSSRPLRPLDTSQSSRACDCDQQIDSCTYEA 

M Y JN lljbv^ A Pb 1 1 hob 1 rGEGNLAAAHANTGPEES 

ENEDDGYDVPKPPVPAVLARRTLSDISNASSS/FG 

LFVLERDP*PQNVTEGSQVPERPPKPFPRRINSER 

KAGSCQQGSGPAASAATA\SPQLSSE1ENLMSQG 

YSYQDIQKALVIAQNNIEMAKNILREFVSISSPAH 

VAT 

3851 

A 

2 

2781 

GRVGSMDGAMGPRGLLLCMYLVSLLILQAMPA 

LGSATGRSKSSEKRQ A VDTA VDG VF1RSLK VNC 

KVTSRFAHYVVTSQVVNTANEAREVAFDLEIPK 

TAFISDFAVTADGNAFIGDIKDKVTAWKQYRKA 

AISGENAGLVRASGRTMEQFTIHLTVNPQSKVTF 

QLTYEEVLKRNHMQYEIVIKVKPKQLVHHFE1DV 

DIFEPQGISKLDAQASFLPKELAAQTIKKSFSGKK 

GHVLFRPTVSQQQSCPTCSTSLLNGHFKVTYDVS 

RDKICDLLVANMHFAHFFAPQNLTNMNKNVVFV 

IDISGSMRGQKVKQTKEALLK1LGDMQPGDYFD 

LVLFGTRVQSWKGSLVQASEANLQAAQDFVRGF 

SLDEATNLNGGLLRGIEILNQVQESLPELSNHASI 

LIMLTDGDPTEGVTDRSQILKNVRNAIRGRFPLY 

NLGFGHNVDFNFLEVMSMENNGRAQRIYEDHD 

ATQQLQGFYSQVAKPLLVDVDLQYPQDAVLALT 

QNHHKQYYEGSEIVVAGR1ADNKQSSFKADVQA 

HGEGOEFSTTCLVDFFFMKTCI T RFRHHMT FNHV 

ERLWAYLTIQEL1JVKRMKVDREVRANLSSQALR 

MSLDYGFVTPLTSMSIRGMADQDGLKPTIDKPSE 

DSPPLEMLGPRRTFVLSALQPSPTHSSSNTQRLPD 

RVTGVDTDPHFIIHVPQKEDTLCFNINEEPGVILS 

LVQDPNTGFSVNGQLIGNKARSPGQHDGTYFGR 

LGIANPATDFQLEVTPQNITLNPGFGGPVFSWRD 

QAVLRQDGVVVTINKKRNLVVSVDDGGTIAEVV\ 

LHRVW\KGSS\VHQDFLGLLMCWDKSIGMSSPGR 

KGCWGQ\FFHPIRFLKVS*HPPPGSDPQKAQMPT 

MVVRNPPGLTVTVRGLQKDYSKDPWHGAEVSC 

WFI\HNNGA*I\TDCAYTDYI\VPDIF 

3852 

A 

39 

1735 

TQVAEAGRGEGVVAGAETGRPQSAGMNLELLES 

FGQNYPEEADGTLDCISMALTCTFNRWGTLLAV 

GCNDGRlVlW\DrALTRGIA*NKFSAHIHPVCSLC 

WSRDGHKLVSASTDNIVSQWDVLSGDCDQRFRF 

PSPILKVQYHPRDQNKVLVCPMKSAPVMLTLSD 

SKHVVLPVDDDSDLNVVASFDRRGEYIYTGNAK 

GK1LVLKTDSQDLVASFRVTTGTSNTTAIKSIEFA 

RKGSCFLINTADRIIRVYDGREILTCGRBGEPEPM 

QKLQDLVNRTPWKKCCFSGDGEYIVAGSARQH 

ALYIWEKSIGNLVK1LHGTRGELLLDVAWHPVRP 

IIASISSGVVSIWAQNQVENWSAFAPDFKELDEN 

VEYEERESEFDIEDEDKSEPEQTGADAAEDEEVD 

VTSVDPIAAFCSSDEELEDSKALLYLPIAPEVEDP 

EENPYGPPPDAVQTSLMDEGASSEKKRQSSADG 

SQPPKKKPKTTNIELQGVPNDEVHPLLGVKGDG 

KSKKKQAGRPKGSKGKEKDSPFKPKLYKGDRGL 

PLEGSAKGKVQAELSQPLTAGGAISELL 

3853 

A 

45 

2603 

PLLFTCGREVRARDPEKEGTIVVAGLKVQVQPRF 

LW1LCFSMEETQGELTSSCGSKTMANVSLAFRDV 

SIDLSQEEWECLDAVQRDLYKDVMLENYSNLVS 

LDLEYKYITKNLLSEKNVCKIYLSQLQTGEKSKN 

TIHEDTIFRNGLQCKHEFERQERHQMGCVSQMLI 

QKQISHPLHPKIHAREKSYECKECRKAFRQQSYL1 

QHLR1HTGERPYKCMECGKAFCRVGDLRVHHTI 

HAGERPYECKECGKAFRLHYHLTEHQR1HSGVK 

PYECKECGKAFSRVRDLRVHQT1HAGERPYECK 

ECGKAFRLHYQLTEHQRJHTGERPYECKVCGKT 

FRVQRHISQHQKIHTGVKPYKCNECGKAFSHGS 

YLVQHQK1HTGEKPYECKECGKSFSFHAELARH 

RRJHTGEKPYECRECGKAFRLQTELTRHHRTHTG 

EKPYECKECGKAFICGYQLTLHLRTHTGEDPYEC 

KECGKTFSSRYHLTQHYRIHTGEKPYICNECGKA 

FRLQGELTRHHRIHTCEKPYECKECGKAFIHSNQ 

FISHQRIHTSESTYICKECGKIFSRRYNLTQHFKIH 

TGEKPYICNECGKAFRFQTELTQHHRIHTGEKPY 

KCTECGKAFIRSTHLTQHHR1HTGEKPYECTECG 

KTFSRHYHLTQHHRGHTGEKPYICNECGNAFICS 

YRLTLHQRIHTGELPYECKECGKTFSRRYHLTQH 

FRLHTGEKPYSCKECGNAFRLQAELTRHHIVHTG 

EKPYKCKECGKAFSVNSELTRHHR1HTGEKPYQC 

KECGKAFmSDQLTLHQ\KIILVR\NPMHNVKRIR 

WPLENAL*QRICNLRNFLFVTEHVG1PFTSCSQFI 

RNYFVC 

3854 

A 

108 

894 

LQSCWVPGIPWPSVGWLSWLKDLPSCEIHSASLS 
AVLQGPQCSEMLWPKNLTSWDDSSSVSSGISDTI 
DNLSTDDINTSSS1SSYANTPASSRKNLDVQTDAE 
KHSQVERNSLWSGDDVKKSDGGSDSGIKMEPGS 
KWRRNPSDVSDESDKSTSGKKNPVISQTGSWRR 
GMTAQVGITMPRTKASAPAGALKTPGTGKRPGL 

S\GPGAPTPAAPPQLARMAWAFSLSAASTPAVSP 
STSPSAVEGSPATILPLASSPPPRTTP*LPLSELTV* 
RPQELVRGRGCLGPGAPTPAAPPQLARMAWAFS 
LSAASTPAVSPSTSPSAVEGSPATILPLASSPPPRT 
TP 

3855 

A 

1 

772 

FRGGDGAPGVLKPGNPLPFPLPPLQYPPPSTLSHS 
DNLAMTSRSTARPNGQPQASKICQFKLVLLGESA 
VfikT<s9I VT RFVKfiOFWFYOF^TTrJA AFT TDWrf 

DDTTVKFEIWDTAGQERYHSLAPMYYRGAQAAI 

VVYDITNQETFARAKTWVKELQRQASP\SIVVGL 

AGNKADLANKRMVEYEEAQAYADDNSLLFMET 

SAKTAMNVNDLFL\AIA*EVAKRVNPQNLG\G\A 

AGRSRGVDLHEQS\QQNKSQCCSN 

3856 

A 

2815 

352 

LGLEAAARPRPGGPAAMQDGNFLLSALQPEAGV 

CSLALPSDLQLDRRGAEGPEAERLRAARVQEQV

RARLLQLGQQPRHNGAAEPEPEAETARGTSRGQ 

YHTLQAGFSSRSQGLSGDKTSGFRPIAKPAYSPA 

SWSSRSAVDLSCSRRLSSAHNGGSAFGAAGYGG 

AQPTPPMPTRPVSFHERGGVGSRADYDTLSLRSL 

RLGPGGLDDRYSLVSEQLEPAATSTYRAFAYER 

QASSSSSRAGGLDWPEATEVSPSRTIRAPAVRTL 

QRFQSSHRSRGVGGAVPGAVLEPVARAPSVRSLS 

LSLADSGHLPDVHGFNSYGSHRTLQRLSSGFDDI 

DLPSAVKYLMASDPNLQVLGAAYIQHKCYSDAA 

AKKQARSLQAVPRLVKLFNHANQEVQRHATGA 

MRNLIYDNADNKLALVEENGIFELLRTLREQDDE

LRKNVTGILWNLSSSDHLKDRLAKKTPLEVQLTVD 

LGV*APLSGAGGPP\LIQQNASEAE1FYNATGFPR 

NLSSASQATRQKMRECHGLVDALVTSINHALDA 

GKCEDKSVENAVCVLRNLSYRLYDEMPPSALQR 

LEGRGRRDLAGAPPGEVVGCFTPQSRRLRELPLA 

ADALTFAEVSKDPKGLEWLWSPQIVGLYNRLLQ 

i\a^i_,i_, in rvn i i jc»/\/A./vvj/\i>v^iNi i vHj\L/r Ivor oojuoIvJu 

ALEQERILNPLLDRVRTADHHQLRSLTGLIRNLS 

RNARNKDEMSTKVV\SHLI\EKLPGSVGEKSPPAE 

VLV\NI\1AVFNNLGWLASPI/ALARDLLYFDGLRK 

LMK1CKJRDSPDSEKSSRAASSLLANLWQYNKLH 

RDFRAKGYRKEDFLGP 

3857 

A 

1034 

204 

VAVTLLSQLPSAIQRTAAWEMRAPLTFRVPLALD 

LDCPEHCTVNVDNSLSIPVIAAELVVRKPSEKGM 

QQKKKTKDLGFRAGKESKTEWRK*GLQDMASQ 

MFAT Pf K*PVTA AFHD^^MP^^T T OTFMFOT FT F 

ARLQ/PDSKSEARRNQCDSMLLRNQQLCSTCQE 

MKMVQPRTMK1PDDPKASFENCMSYRMSLHQP 

KFQTTPEPFHDDIPTENIHLQNL/PILGPRTAVFHG 

LLTEAYKTLKERQRSSLPRKEPIGKTTEAVSGRSS 

SPPRLPERK 

3858 

A 

203 

3469 

SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPE 

ITYRLRNDSNFALQTMEPALPMPPVEELDVMFSE 

LVDELDLTDKHREAMFALPAEKKWOrY(^KJO: 

DQEENKGATSWPEFYIDQLNSMAARKSLLALEK 

EEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLS 

CILNFLKTMDYETSESRIHTSLIGCIKALMNNSQG 

RAHVLAHSESINVIAQSLSTENIKTKVAVLEILGA 

VCLVPGGHKKVLQAMLHYQKYASERTRFQTLIN 

DLDKSTGRYRDEVSLKTAIMSFINAVLSQGAGVE 

SLDFRLHLRYE\FLMLGIHPVMDKLRKHENSTLD 

RHLDFFEMLRNEDELEFAKRFELVHIDTKSATQM 

FELTRKRLTHSEAYPHFMSILHHCLQMPYKRSGN 

TVQYWLLLDRI1QQ1VIQNDKGQDPDSTPLENFNI 

KNVVRMLVNEl^VKQWKEQAEKMRKEHNELQ 

QBCLEKKERECDAKTQEKEEMMQTLNKMKEKLE 

KETTEHKQVKQQVADLTAQLHELSRRAVCASIP 

GGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGMLPP 

PPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALK 

KKSIPQPTNALKSFNWSKLPENKLEGTVWTEIDD 

TKVFK1LDLEDLERTFSAYQRQQDFFVNSNSKQK 

EADAIDDTLSSKLKVKELSV1DGRRAQNCNILLS 

RLKLSNDEIKRA1LTMDEQEDLPKDMLEQLLKFV 

PEKSDU3LLEEHKHELDRMAKADRFLFEMSRINH 

YQQRLQSLYFKKKFAERVAEVKPKVEA1RSGSEE 

VFRSGALKQLLEVVLAFGNYMNKGQRGNAYGF 

KISSLNKIADTKSSIDKNITLLHYLIT1VENKYPSV 

LNLNEELRDIPQAAKVNMTELDKEISTLRSGLKA 

VETELEYQKSQPPQPGDKFVSVVSQF1TVASFSFS 

DVEDLLAEAKDLFTKAVKHFGEEAGKIQPDEFF

GIFDQFLQAVSEAKQENENMRKKKEEEERRARM 

EAQLKEQRERERKMRKAKENSEESGEFDDLVSA 

LRSGEVFDKDLSKLKRNRKRITNQMTDSSRERPI 

TKLNF 

3859 

A 

1279 

141 

RVEHLSEFLVDIKPSLTFDVIPLLDPYGPAGSDPS 

LEFLVVSEETYRGGMAINRFRLENDLEELALYQI 

QLLKDLRHTENEEDKVSSSSFRQRMLGNLLRPPY 

ERPELPTCLYVIGLTGISGSGKSSIAQRLKGLGAF 

VIDSDHLGHRAYAPGGPAYQPVVEAFGTDILHK 

DGIINRKVLGSRVFGNKKQLKILTDIMWPIIAKLA 

REEMDRAVAEGKRVCVIDAAVLLEAGWQNLVH 

EVWTAVIPETEAVRRTVERDGLSEAAAQSRLQSQ 

MSGQQLVEQSHVVLS-nCGSRlSPNARWRKPGPS 

CRSAFPRLIRPSTEKFSVGPDWLLELTSDPVVRRN

GGLDAHPGSGPEVQAILCRTWPGLVDTGSLPNTL 

VFGQH 

3860 

A 

1 

3881 

MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV 

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEDLVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDVVEALSEEHMEADGHAAVVFGTVVDIISRS 

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV

GNESGCGERTLDPWQGQDPAEGGQDPRINVVLA 

GGHVVPRDEIRKLMESQDIFTGTQTELIAGGQLL

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW 

TAVDKEKNKEVVVKFIKKEKVLEDCWIEDPKXG

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 

3861 

A 

1 

3881 

MGQKSVGASYVQIPLVPPLSRHPKGLGHEDRWS 

SYCLSSLAAQNICTSKLHCPAAPEHTDPSEPRGSV

SCCSLLRGLSSGWSSPLLPAPVCNPNKAIFTVDA 

KTTEILVANDKACGLLGYSSQDLIGQKLTQFFLR 

SDSDVVEALSEEHMEADGHAAVVFGTVVDIISRS

GEKIPVSVWMKRMRQERRLCCVVVLEPVERVST 

WVAFQSDGTVTSCDSLFAHLHGYVSGEDVAGQ 

HITDLIPSVQLPPSGQHIPKNLKIQRSVGRARDGT 

TFPLSLKLKSQPSSEEATTGEAAPVSGYRASVWV 

FCTISGLITLLPDGTIHGINHSFALTLFGYGKTELL 

GKNITFLIPGFYSYMDLAYNSSLQLPDLASCLDV

GNESGCGERTLDPWQGQDPAEGGQDPRINVVLA 

GGHVVPRDEIRKLMESQDIFTGTQTELIAGGQLL

SCLSPQPAPGVDNVPEGSLPVHGEQALPKDQQIT 

ALGREEPVAIESPGQDLLGESRSEPVDVKPFASCE 

DSEAPVPAEDGGSDAGMCGLCQKAQLERMGVS 

GPSGSDLWAGAAVAKPQAKGQLAGGSLLMHCP 

CYGSEWGLWWRSQDLAPSPSGMAGLSFGTPTLD 

EPWLGVENDREELQTCLIKEQLSQLSLAGALDVP 

HAELVPTECQAVTAPVSSCDLGGRDLCGGCTGS 

SSACYALATDLPGGLEAVEAQEVDVNSFSWNLK 

ELFFSDQTDQTSSNCSCATSELRETPSSLAVGSDP 

DVGSLQEQGSCVLDDRELLLLTGTCVDLGQGRR 

FRESCVGHDPTEPLEVCLVSSEHYAASDRESPGH 

VPSTLDAGPEDTCPSAEEPRLNVQVTSTPVIVMR 

GAAGLQREIQEGAYSGSCYHRDGLRLSIQFEVRR 

VELQGPTPLFCCWLVKDLLHSQRDSAARTRLFL 

ASLPGSTHSTAAELTGPSLVEVLRARPWFEEPPK 

AVELEGLAACEGEYSQKYSTMSPLGSGAFGFVW

TAVDKEKNKEVVVKFIKKEKVLEDCWIEDPKLG 

KVTLEIAILSRVEHANIIKVLDIFENQGFFQLVME 

KHGSGLDLFAFIDRHPRLDEPLASYIFRQVRAG\Q 

SRLVSAVGYLRLKDIIHRDIKDENIVIAEDFTIKLI 

DFGSAAYLERGKLFYTFCGTIEYCAPEVLMGNPY 

RGPELEMWSLGVTLYTLVFEENPFCELEETVEAA 

IHPPYLVSKELMSLVSGLLQPVPERRTTLEKLVT 

DPWVTQPVNLADYTWEEVFRVNKPESGVLSAAS 

LEMGNRSLSDVAQAQELCGGPVPGEAPNGQGCL 

HPGDPRLLTS 

3862 

A 

399 

2069 

TMDRSKRNSIAGFPPRVE\RLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAVVADFGLAEKIPDVSMGSEKLA 

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDG AARTPK VNPF S ARQDLMGGKIKF r DLPbK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 

3863 

A 

399 

2069 

TMDRSKRNSIAGFPPRVE\RLEEFEGGGGGEGNV 

SQVGRVWPSSYRALISAFFRLTRLDDFTCEKIGSG 

FFSEVFKVRHRASGQVMALKMNTLSSNRANML 

KEVQLMNRLSHPNILRYINSGNLEQLLDSNLHLP 

WTVRVKLAYDIAVGLSYLHFKGIFHRDLTSKNC 

LIKRDENGYSAVVADFGLAEKIPDVSMGSEKLA

VVGSPFWMAPEVLRDEPYNEKADVFSYGIILCEII 

ARIQADPDYLPRTENFGLDYDAFQHMVGDCPPD 

FLQLTFNCCNMDPKLRPSFVEIGKTLEEILSRLQE 

EEQERDRKLQPTARGLLEKAPGVKRLSSLDDKIP 

HKSPCPRRTIWLSRSQSDIFSRKPPRTVSVLDPYY 

RPRDGAARTPKVNPFSARQDLMGGKJKJFFDLPbK 

SVISLVFDLDAPGPGTMPLADWQEPLAPPIRRWR 

SLPGSPEFLHQEACPFVGREESLSDGPPPRLSSLK 

YRVKEIPPFRASALPAAQAHEAMDCSILQEENGF 

GSRPQGTSPCPAGASEEMEVEERPAGSTPATFSTS 

GIGLQTQGKQDG 

3864 

A 

3 

911 

SWNMDSDSCAAAFHPEEYSPSCKRRRTVEDFNK 

FCTFVLAYAGYIPYPKEELPLRSSPSPANSTAGTI 

DSDGWDAGFSDIASSVPLPVSDRCFSHLQPTLLQ 

t> a vdcmtt t t\x> i/fttm/t vvi^t^i^ovd dhcha vfiff 
RAKPoNr LLJJKKK 1 DJ^JsJsJMvrUuvKK^ 

EGYRGGLLKLEAADPYVETPTSPTLQDIPQAPSD 

PCSGWDSDTPSSGSCATVSPDQVKEIKTEGKRTI 

VR/QEAQLMARNDGNFSSLLESIFPS\DDDSWDLV 

TCFCMKPFAGRPMIECNECHTWIHLSCAKIRKSN 

VPEVFVCQKCRDSKFDIRRSNRSRTGSRKLFLD 

3865

A 

3 

3573 

QERLRSRSRPDRAAREAGSARGRQPKRTERVEQ 

EGSVESASETRSGPQSASTAVKERPASSEKVKGG 

DDHDDTSDSDSDGLTLKELQNRLRRKREQEPTE 

RPLKGIQSRLRKKRREEGPAETVGSEASDTVEGV 

LPSKQEPENDQGWSQAGKDDRESKLEGKAAQD 

IKDEEPGDLGRPKPECEGYDPNALYCICRQPHNN 

RFMICCDRCEEWFHGDCVGISEARGRLLERNGE 

DYICPNCTILQVQDETHSETADQQEAKWRPGDA 

DGTDCTSIGTIEQKSSEDQGIKGRIEKAANPSGKK

KLKIFQPGPGPVPTQLPVLWQVLEIAVSRSISAFT 

LLHCISCKVIEAPGASKCIGPGCCHVAQPDSVYCS 

NDCILKHAAATMKPLSSGKEQKPKPKEKMKMK 

PEKPSLPKCGAQAGIKISSVHKEPAPEKKETTVK 

KAVVVPARSEALGKEAACESSTPSWASDHNYNA 

VKPEKTAAPSPSLLYKSTKEDRRSEEKAAATAAS 

KKTAPPGSTVGKQPAPRNLVPKKSSFANVAAAT 

PAIKXPPSGFKGTIPKRPWLSATPSSGASAARQAG 

PAPAAATAASKXFPGSAALVGAVRKPVVPSVPM 

ASPAPGRLGAMSAAPSQPNSQIRQNIRRSLKEIL 

WK/RFLFFILFRVNDSDDLIMTENEVGKIALHIEK 

EMFNLFQVTDN/RAYKSKYRSIMFNLKDPKNQG 

LFHRVLREEISLAKLVRLKPEELVSKELSTWKER 

PARSVMESRTKLHNESKKTAPRQEAIPDLEDSPP 

VSDSEEQQESARAVPEKSTAPLLDVFSSMLKDTT 

SQHRAHLFDLNCKICTGQVPSAEDEPAPKKQKLS 

ASVKKEDLKSKHDSSAPDPAPDSADEVMPEAVP 

EVASEPGLESASHPNVDRTYFPGPPGDGHPEPSPL 

EDLSPCPASCGSGVVTTVTVSGRDPRTAPSSSCT 

AVASAASRPDSTHMVEARQDVPKPVLTSVMVPK 

SILAKPSSSPDPRYLSVPPSPNISTSESRSPPEGDTT 

LFLSRLSTIWKGFINMQSVAKFVTKAYPVSGCFD 

YLSEDLPDTIHIGGR1APKTVWDYVGKLKSSVSK 

ELCLIRFHPATEEEEVAYISLYSYFSSRGRFGVVA 

NNNRHVKDLYLIPLSAQDPVPSKLLPFEGPGKRR 

LSGWR 

3866 

A 

2 

3181 

AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ 

G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD 

FTSRVIGFTVLTEADPAATFDDPYALVVLAEEEL

VVIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN

IPLKLWERIIAAGSRQNAHFSTMEWIDGGTSLTP 

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF 

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKXSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAXEIQLMHRAPVVGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 

T TcTT V"T TAT PnQRVRUVWAHPfi^RR APnvnPT4U 
J-.1VHSX. 1 /\Lcuoi\ v ivJtv V o v /\nr woivrWAJDly i oiirirl 

LAVLTNLGDIQVVSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKGNLVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 

3867 

A 

2 

3181 

AQQPVGRRGGASGAGGGRRGTPRPRAGAGPGF 

QVSSGGCRLSKMRRFLRPGHDPVRERLKRDLFQ 

FNKTVEHGFPHQPSALGYSPSLRILAIGTRSGAIK 

LYGAPGVEFMGLHQENNAVTQIHLLPGQCQLVT 

LLDDNSLHLWSLKVKGGASELQEDESFTLRGPP 

GAAPSATQITVVLPHSSCELLYLGTESGNVFVVQ 

LPAFRALEDRTISSDAVLQRLPEEARHRRVFEMV 

EALQEHPRDPNQILIGYSRGLVVIWDLQGSRVLY 

HFLSSQQLENIWWQRDGRLLVSCHSDGSYCQWP 

VSSEAQQPEPLRSLVPYGPFPCKAITRILWLTTRQ

G\LPFTIFQGGMPRASYGDRHCISVIHDGQQTAFD

FTSRVIGFTVLTEADPAATFDDPYALVVLAEEEL 

VVIDLQTAGWPPVQLPYLASLHCSAITCSHHVSN 

IPLKLWERIIAAGSRQNAHFSTMEWPIDGGTSLTP

APPQRDLLLTGHEDGTVRFWDASGVCLRLLYKL 

STVRVFLTDTDPNENLSAQGEDEWPPLRKVGSF

DPYSDDPRLGIQKIFLCKYSGYLAVAGTAGQVLV 

LELNDEAAEQAVEQVEADLLQDQEGYRWKGHE 

RLAARSGPVRFEPGFQPFVLVQCQPPAVVTSLAL 

HSEWRLVAFGTSHGFGLFDHQQRRQVFVKCTLH 

PSDQLALEGPLSRVKSLKKSLRQSFRRMRRSRVS 

SRKRHPAGPPGEAQEGSAKAERPGLQNMELAPV

QRKIEARSAEDSFTGFVRTLYFADTYLKDSSRHC 

PSLWAGTNGGTIYAFSLRVPPAERRMDEPVRAE 

QAKEIQLMHRAPVVGILVLDGHSVPLPEPLEVAH 

DLSKSPDMQGSHQLLVVSEEQFKVFTLPKVSAK 

T V T VI TAT Rr.Ql?yDPVCV AT-TEVJQPP A T7TYVY"5FT-n-I 

LAVLTNLGDIQVVSLPLLKPQVRYSCIRREDVSGI 

ASCVFTKYGQGFYLISPSEFERFSLSTKGMLVEPRC 

LVDSAETKNHRPGNGAGPKKAPSRARNSGTQSD 

GEEKQPGLVMERALLSDERAATGWHIEPPWGA 

ASAMAEQSEWLSVQAAR 

3868 

A 

1 

2497 

GDSGGPLVCEEPSGRFFLAGIVSWGIGCAEARRP 

GVYARVTRLRDWILEATTKASMPLAPTMAPAPA 

APSTAWFTSPESPVVSTPTKSMQALSTVPLDWVT 

VPKLQECGARPAMEKPTRVVGGFGAASGEVPW 

QVSLKEGSRHFCGATVVGDRWLLSAAHCFNHT 

KVEQVRAHLGTASLLGLGGSPVK1GLRRVVLHP 

LYNPGILDFDLAVLELASPLAFNKYIQPVCLPLAI 

QKFPVGRKCMISGWGNTQEGNATKPELLQKASV 

GI1DQKTCSVLYNFSLTDRMICAGFLEGKVDSCQ 

VSGIKALYESELADARRVLDETARERARLQIEIG 

KLRAELDEVNKSAKKREGELTVAQGRVKDLESL 

FHP<?pvFT A A AT ^DKRnT F<UWAFI ftAOT AKAF 

DGHAVAKKQLEKETLMRVDLENRCQSLQEELDF 

RKSVFEEEVRETRRRHERRLVEVDSSRQQEYDFK 

MAQALEELRSQHDEQVRLYKLELEQTYQAKLDS 

AKLSSDQNDKAASAAREELKEARMRLESLSYQL 

SGLQKQASAAEDRIRELEEAi^GERDCTRKMLD 

AKEQEMTEMRDVMQQQLAEYQELLDVKLALD 

MEINAYRKLLEGEEERLKLSPSPSSRVTVSRATSS 

SSGSLSATGRLGRSKJIKR\WRWRSPW\QRPKRPG 

QLKNNSDKDQSLGN WPJKRQ V LEGEEIA YKFTP 

KYILRAGQMVTVWAAGAGVAHSPPSTLVWKGQ 

SSWGTGESFRTVLVNADGEEVAMRTVKKSSVM 

RENENGEEEEEEAEFGEEDLFHQQGDPRTTSRGC 

YVM 

3869 

A 

1 

1942 

RYRAGIPGDGRKDYIRLTRPGLTLPGRAMFARGS

RRRRSGRAPPEAEDPDRGQPCNSCREQCPGFLLH 

GWRKICQHCKCPREEHAVHAVPVDLERIMCRLIS 

DFQRHSISDDDSGCASEEYAWVPPGLKPEQVYQ 

FFSCLPEDKVPYVNSPGEKYRIKQLLHQLPPHDS 

EAQYCTAL\EE\EEKKELRAFSQQRJCRENLG/RLG 

IVRIFPVTIT\GAI\CEECGKQIGGGDIAVF\ASRASL 

GLLLGQPSCFWCTTCQELLVDLIYFYHVGKVYC 

GRHHAECLRPRCQACDEIIFSPECTEAEGRHWHM 

DHFCCFECEASLGGQRYVMRQSRPHCCACYEAR 

HAEYCDGCGEHIGLDQGQMAYEGQHWHASDRC 

FCCSRCGRALLGRPFLPRRGLIFCSRACSLGSEPT 

APGPSRRSWSAGPVTAPLAASTASFSAVKGASET 

TTKGTSTELAPATGPEEPSRFLRGAPHRHSMPEL

RDPLVSEGGPRRTLSAPPAQRRRPRSPPPRAPSRR 

RHHHHNHHHHHNRHPSRRRHYQCDAGSGSDSE 

SCSSSPSSSSSESSEDDGFFLGERIPLPPHLCRPMP 

AQDTAMETFNSPSLSLPRDSRAGMPRQARDKNC 

IVA 

3870 

A 

2 

3485 

FVWRVFYVHASCMPPRARSWEGAHAPVGMHV 

AEAHACSSQQQQMPPAQFWMLEWLLHLCAFLS 

TPSFPHWCCCSNPHGSIADKPEEIVPASKPSRAAE 

NMAVEPRVATIKQRPSSRCFPAGSDMNSVYERQ 

GIAVMTPTVPGSPKAPFLGIPRGTMRRQKSIDSRI 

FLSGITEEERQFLAPPMLKFTRSLSMPDTSEDIPPP 

PQSVPPSPPPPSPTTYNCPKSPTPRVYGTIKPAFNQ 

NSAAKVSPATRSDTVATMMREKGMYFRRELDR 

YSLDSEDLYSRNAGPQANFRNKRGQMPENPYSE 

VGKIASKAVYVPAKPARRKGMLVKQSNVEDSPE 

KTCSIPIPTUVKEPSTSSSGKSSQGSSMEIDPQAPE 

PPSQLRPDESLTVSSPFAAAIAGAVRDREKRLEA 

RKNSPAFLSADLGDEHVGLGPPAPRTRPSMFPEE 

GDFADEDSAEQLSSPMPSATPREPENHFVGGAEA 

SAPGEAGRPLNSTSKAQGPESSPAVPSASSGTAG 

PGNYVHPLTGRLLDPSSPLALALSARDRAMKES 

QQGPKGEAPKADLNKPLYIDTKMRPSLDAGFPT 

VTRQNTRGPLRRQETENKYETDLGRDRKGDDK 

KNMLIDIMDTSQQKSAGLLMVHTVDATKLDNA

LQEEDEKAEVEMKPDSSPSEVPEGVSETEGALQI 

SAAPEPTTVPGRTIVAVGSMEEAVILPFRIPPPPLA 

SVDLDEDFIFTEPLPPPLEFANSFDIPDDRAASVPA 

LSDLVKQKKSDTPQSPSLNSSQPTNSADSKKPAS 

LSNCLPASFLPPPESFDAVADSG1EEVDSRSSSDH 

HLETTSTISTVSSISTLSSEGGENVDTCTVYADGQ 

AFMVDKPPVPPKPKMKPIIHKSNALYQDALVEE 

DVDSFVIPPPAPPPPPGSAQPGMAKVLQPRTSKL 

WGDVTEIKSPILSGPKANVISELNSILQQMNREKL 

AKPGEGLDSPMGAKSASLAPRSPEIMST1SGTRST 

TVTFTVR PGTSOPTTT OSRPPD YFSRTSGTR RAPS 

PVVSPTEMNKETLPAPLSAATASPSPALSDVFSLP 

SQPPSGDLFGLNPAGRSRSPSPSILQQPISNKPFTT 

KPVHLWTKPDVADWLESLNLGEHKEAFMDNEI 

DGSHLPNLQKEDLIDLGVTRVGHRMNIERALKQ 

LLDR 

3871 

A 

35 

1171 

VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD

EVNALVLQTQQEDENLKPLLPAGIQDKLHTLIPC 

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TPATPTYNJASOFFT MTTT VTOT ASVTSRTSMOTTTV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT

HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 

VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID

QLEKIQNNSKLLRNKAVQLENELENFTKQFLPSS 

NEES 

3872 

A 

35 

1171 

VESRSAWHEGEDQIDRLDFIRNQMNLLTLDVKK 

KIKEVTEEVANKVSCAMTDEICRLSVLVDEFCSE 

FHPNPDVLKIYKSELNKHIEDGMGRNLADRCTD

EVNALVLQTQQEIIENLKPLLPAGIQDKLHTLIPC

KKFDLSYNLNYHKLCSDFQEDIVFRFSLGWSSLV 

HRFLGPRNAQRVLLGLSEPIFQLPRSLASTPTAPT 

TP A TPTYW A SOFFT MTTT VTHT ASVTSRTSMHTTTV 

GGVIWKTIGWKLLSVSLTMYGALYLYERLSWTT 
HAKERAFKQQFVNYATEKLRMIVSSTSANCSHQ 
VKQQIATTFARLCQQVDITQKQLEEEIARLPKEID
QLEKIQNNSKLLKNKAVQLENELENFTKQFLPSS 
NEES 

3873 

A 

2944 

2089 

PVCTALTPGRMTDDKDVLRDVWFGRIPTCFTLY 

QDEITEREAEPYYLLLPRVSYLTLVTDKVKKHFQ 

KVMRQEDISEIWFEYEGTPLKWHYPIGLLFDLLA 

SSSAT PWT^TVHFKSFPFICDLLHrPSKDArEAHF 

MSCMKEADALKIiKSQVINEMQKKDHKQLWMG 

LQNDRFDQFWAINRKLMEYPAEENGFRYIPFRIY 

QTTTERPFIQKLFRPVAADGQLHTLGDLLKEVCP 

SAIDPEDGEKKNQVMIHGIEPMLETPLQWLSEHL 

SYPDNFLHISIIPOPTD 

3874 

A 

776 

366 

QARGAPSSPMCPLPLAAAAVAAPRAPLRLLNRG 

LAAAMSTAQSLKSVDYEVFGRVQGVCFRMYTE 

DEARKIGVVGWVKNTSKGTVTGQVQGPEDKVN 

SMKSWLSKVGSPSSRIDRTNFSNEKTISKLEYSNF 

SIRY 

3875 

A 

1081 

182 

SLSSCQTDPRPMSAPLDAALHALQEEQARLKMR 

LWDLQQLRKELGDSPKDKVPFSVPKIPLVFRGHT 

OODPEVPKSLVSNLRIHCPLLAGSALITFDDPKVA 

EQVLQQKEHTOMEECRLRVQVQPLELPMVTTIQ 

VMVSSQLSGRRVLVTGFPASLRLSEEELLDKLEIF 

FGKTRNGGGDVDVRELLPGSVMLGFARDGVAQ 

RLCQIGQFTVPLGGQQVPLRVSPYVNGEIQKAEI 

RSQPVPRSVLVLNIPDILDGPELHDVLEIHFQKPT 

RGGGEVEALTVVPQGQQGLAVFTSESG 

3876 

A 

26 

431 

RMMKCPQALLAIFWLLLSWVSSEDKVVQSPLSL 
VVHEGDTVTLNCSYEVTNFRSLLWYKQEKKAPT 
FLFMLTSSGIEKKSGRLSSILDKKELSSILNITATQ 
TGDSAIYLCAVEAQCSLVTCSLYSNSTAEALQL 

3877 

A 

3 

1291 

KAFRLLAERGAAAAMLWSGCRRFGARLGCLPG 

GLRVLVQTGHRSLTSCIDPSMGLNEEQKEFQKV 

AFDFAAREMAPNMAEWDQKELFPVDVMRKAA 

QLGFGGVYIQTDVGGSGLSRLDTSVIFEALATGC 

TSTTAYISIHNMCAWMIDSFGNEEQRHKFCPPLC 

TMEKJFASYCLTEPGSGSDAASLLTSAKKQGDHYI 

LNGSKAFISGAGESDIYVVMCRTGGPGPKGISCIV 

VEKGTPGLSFGKKEKKVGWNSOPTRAVIFEDCA 

VPVANRIGSEGQGFLIAVRGLNGGRJNIASCSLGA 

AHASVILTRDHLNVRKQFGEPLASNQYLQFTLA 

DMATRLVAARLMVRNAAVALQEERKDAVALCS 

MAKLFATDECFAICNQALQMHGGYGYLKDYAV 

QQYVRDSRVHQILEGSNEVMRILISRSLLQE 

3878 

A 

10 

1014 

LPGSTISSSGCQAPGRADSSGGARNSRRGDSRPG 

SCNRQAVAPPCPSPGPQSRHWIHRGTAPQAGETR 

TLGRGSSAPNACSASVTPCCPSSPPS*SCL*PTRRS 

PQNSSSTEVYRGFWQHGLPST**PFSS*QWPGQH 

TOGOSKI I GKOTTHl PCSTWPA**PSPSC1 TRFR* 

W*PSLMCLWASSCSVCV*SPSGSCRH*LWGTHST 

SRTC*ARRSSALPTGLCTDDTSWASSSKARPCAL 

QRPSSLSSLSPCLTC*W*LSSSSPMSARSPAGAET 

GSWATGSPRLTQWKSSRLTSTSHSARSAWKJPSA 

TESTPSWPRFSSWTSGEDPASPAPAI 

3879 

A 

200 

699 

LLLTGYIQTLQNQQLSGNQQEMQAVDNLTSAPG 

NTSLCTRDYKITQVLFPLLYTVLFFVGLITNGLA 

MRlFFQIRSKSNFnFLKNTVISDLLMlLTFPFKILS 

DAKLGTGPLRTFVCQVTSVIFYFTMYISISFLGLIT 

IDRYQKTTRPFKTSNPKNLLGAKILK 

3880 

A 

26 

169 

QPETDTMVHLTPEEKSAVTALWGKVNVDEDAG 
DDLCQILVDRPRLRI 

3881 

A 

37 

1100 

TPLFDFWPGFVLSWLQPLSASLRARRAASGPPAC 

RIMPTTVDDVLEHGGEFHFFQKQMFFLLALLSAT 

FAPIYVGIVFLGFTPDHRCRSPGVAELSLRCGWSP 

AEELNYTVPGPGPAGEASPRQCRRYEVDWNQST 

FDCVDPLASLDTNRSRLPLGPCRDGWVYETPGSS 

IVTEFNLVCANSWMLDLFQSSVNVGFFIGSMSIG 

YIADRFGRKLCLLTTVLINAAAGVLMAISPTYTW 

MLIFRLIQGLVSKAGWLIGYILITEFVGRRYRRTV 

GIFYQVAYTVGLLVLAGVAYALPHWRWLQFTV 

ALPNFFFLLYYWCIPESPRWLISQNKNAEAMRIIK 

HIAKKNGKSLPASL 

3882 

A 

573 

1620 

KSKCRFPEGLSEGFGPMRKEALSSGSVQEAEAM 

LDEPQEQAEGSLTVYVISEHSSLLPQDMMSY1GP 

KRTAVWGIlVllIREAl^GRRJVQVAQAMSLTED 

VLAAALADPILPEDKWSAEKRRPLKSSLGYEITFS 

LLNPDPKSHDVYWDffiGAVRRYVQPFLNALGAA 

GNFSVDSQILYYAMLGVNPRFDSASSSYYLDMH 

SLPHVINPVESRLGSSAASLYPVLNFLLYVPELAH 

SPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDS 

KTYNASVLPVRVEVDMVRVMEVFLAQLRLLFGI 

AQPQLPPKCLLSGPTSEGLMTWELDRLLWARSV 
ENLATATTTLTSLA 

3883 

A 

2369 

844 

RIHREEDFQFILKGIARLLSNPLLQTYLPNSTKKIQ 

FHQELLVLFWKLCDFNKVGQPRGALQGDGEQLP 

Q*PGGRDSVRLRGVGQSCPSLELSPLGPSPHP*KF 

LFFVLKSSDVLDILVPILFFLNDARADQSRVGLM 

HIGVFILLLLSGECNFGVRLNKPYSIRVPMDIPVF 

TGTHADLLIVWFHKIITSGHQRLQPLFDCLLTIVV 

NVSPYLKSLSMVTANKLLHLLEAFSTTWFLFSAA 

QNHHLVFFLLEVFNNIIQYQFDGNSNLVYAIIRKR 

SIFHQLANLPTDPPT1HKALQRRRRTPEPLSRTGS 

♦PRSPRCQRMAPCGPWNLSPSRAWRMAARLRGS 

PARHGGSSGDRP/HSSASGQWSPTPEWVLSWKS 

KLPLQTIMRLLQVLVPQVEKICIDKGLTDESEILR 

FLQHGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 

3884 

A 

1 

804 

NGPRAPFSQEGQSTGPPPLIPRLGQHGAQGRIPPL 
NPGQGPGPNKDDSRGPPNHHMGPMSERRHEQSG 
GPEHGPERGPLRGGQDGRuPPDKKGPHPDr ruub 
SRPDDFHPDKRFGHRLREFEGRGGPLPQEEKWR 
RGGPGPPFPPDHREFSEGDGRGAARGPPGAWEG 
RRPGG*TFPPGSRGPTFS/SGAEEESFRRGAPPRHE 
GRAPPRGRDGFPGPEDFGPEENFDASEEAARGRD 
LRGRGRGTPRGERVTKDTWS GRIGCRIHWL 

3885 

A 

3 

996 

GRRRAGPAHSARMYNMMETELKPPGPQQTSGG 

GGGNSTAAAAGGNQKNSPDRVKRPMNAFMVW 

SRGQRRKMAQENPKMHNSEISKRLGAEWKLLSE 

TEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTK 

TLMKKDKYTLPGGLLAPGGNSMASGVGVGAGL 

G AG VN QRMDa Y AHIVIN u W b IS O o Y bftlM^iJt^lAj 

YPQHPGLNAHGAAQMQPMHRYDVSALQYNSM 

TSSQTYMNG/SRPTYSMSYSQQGTPGMAPGS\MG 

SVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMIS 

MYLPGAEVPEPAAPSRLHMSQHYQSGPVPGTAI 

JNO 1 Lrl^oriivl 

3886 

A 

773 

317 

QCTQKAAEGYTQFYYVDVLDGKLACVNKCTKG 
TKSQMNCNLGTCQLQRSGPRCLCPNTNTHWYW 
GETCEFNIAKSLVYGIVGAVMAVLLLALIILIILFS 
LSQ\RKRHRPESEGEADFGLENATNNFG\PTLETV 

JJovj l oi^rllyuvrliM V Ao 1 V 

3887 

A 

3 

466 

VDFRVKTLLVDNKCFVLQLWDTAGQERYHSMT 
RQLLRKADGVVLMYDITSQESFAHVRYWLDCL 
rvnAOQTVTvvTT t t nwT^MT^PFFFBnv^VFAfJOOT 
AQELGVYFGECSAALGHN1LEPVVNLARSLRMQ 
EEGLKDSLVKVAPKRPPKRFGCCS 

3888 

A 

3412 

3144 

QNIDITNFSSSWNDGLAFCALLHTYLPAHIPYQEL 

NSQDKRRNFMLAFQAAESVGIKSTLDINEMVRT 

ERPDWQNVMLYVTAIYKYFET 


A 

A 

i 


T WT ATT ATT APPTsTPYTRMCT'sFT KFI FlSTnPGT T 

DSSKLCDYENRFNTSKGGELPDRPAGVGVYSAM 

WQLALTLILKIVITIFTFGMKIPSGLFIPSMAVGAI 

AGRLLGVGMEQLAYYHQEWTVFNSWCSQGAD 

CITPGLYAMVGAAACLGGVTRMTVSLVVIMFEL 

TGGLEY1VPLMAAAMTSKWVADALGREGIYDA 

HI KLN O Y rr LbAisJibr' AriK 1 LAJVLU V MlsJPRRNDP 

LLTVLTQDSMTVEDVETIISETTYSGFPVVVSRES 

QRLVGFVLRRDLIISIENARKKQDGVVSTSIIYFTE 

HSPPLPPYTPPTLKLRNILDLSPFTVTDLTPMEIVV 

j^irxsjsX/VJL.ivv^v^ij v l oiNvjivi--<JL,wii i tsjvu v i^isjrli/\v<f 

MANQDPDSILFN 

3890 

A 

1 

387 

SWCWTGIFVLGTT^RLEGSWYRSLWGPGFNTT 
TATLGFGAPQAPVGDVALNQPDMCVYRRGRKK 
RVPYTKLQLKELENEYAINKFINKDKRRRISAAT 
NLSERQVTIWFQNRRVKDKK1VSK1KDTVS 

3891 

A 

2 

2914 

RGGGGDHKMADLSLLQEDLQEDADGFGVDDYS 

SESDVIIIPSALDLAST/QDEMVERPLGRL\DK\YA 

ASENHI*PDKMVAPEFASIPLRE\VCDDERDCIAV 

LGKN*PDWADDSEPT\VRAAELEQVPHIALFLFK 

KTRLSITICFFSKFLLPYCGLDTLADQNNNQVRKT 

SQAALL\ALLEQELIERFDVETKVCPVLIELTAPDS 

NDDVKTEAVAIMCKMAPVMVGKDITERLILPRFC 

EMCCDCRMFHWRKWCAANFGDICSVVGQQAT 

EEMLLPRFFQLCSDNVWGVRKACAECFMAVSC 

ATCQEIRRTKLSALFINLISDPSRWVRQAAFQSLG 

PFISTFANPSSSGQYFKEESKSSEEMSVENNKRTR 

DQEAPEDVQVRPEDTPSDLSVSNSSVILENTMED 

HAAEASGKPLGEISVPLDSSLLCTLSSESHQEAAS 

NEhfDKKPGNYKSMLRPEVGTTSQDSALLDQELY 

NSFHFWRTPLPEIDLDIELEQNSGGKPSPEGPEEE 

SEGPVPSSPNITN4ATRKELEEMIENLEPHIDDPDV 

KAQVEVLSAALRASSLDAHEETISIEKRSDLQDE 

LDINELPNCKINQEDSVPLISDAVENMDSTLHYIH 

NDSDLSNNSSFSPDEERRTKVQDVVPQALLDQY 

LSMTDPSRAQTVDTEIAKHCAYSLPGVALTLGR 

QNWHCLRETYETLASDMQWKVRRTLAFSIHELA 

VILGD\QLTAADLVPIFNGFLK*PSMKSRJGVLKH 

LHDFLKLLfflDKRREYLYQLQEFLVTDNSRNAVR 

"CD A CT A TJ/~\T TT T ¥ C?T VODD HIA A/HVT T> DT A T XIT PAH 

r KAliLAbv^LlLLLfcL Y br KD V Y DYLKrlALNLCAD 

KVSSVRWISYKLVSEMVKKLHAATPPTFGVDLIN 

ELVENFGRCPKWSGRQAFVFVCQTVIEDDCLPM 

DQFAVHLMPHLLTLANDRVPNVRVLLAKTLRQT 

LLEKDYFLASASCHQEAVEQTIMALQMDRDSDV 

KYFASIHPASTBCISEDAMSTASSTY 

3892 

A 

158 

2191 

VPLPAPSGLSGGGSRGAGCKKAPPGRAPAPGLAP 

LRPSEPTMAVPPGHGPFSGFPGPQEHTQVLPDVR 

LLPRRLPLAFRDATSAPLRKLSVDLIKTYKHINEV 

YYAKKKRRAQQAPPQDSSNKKEKKVLNHGYDD 

DNHDYIVRSGERWLERYEIDSLIGKGSFGQWKA 

YDHQTQELVAIK1IKNKKAFLNQAQIELRLLELM 

NQHDTEMKYYIVHLKRHFMFRNXHLCLVFELLS 

YNLYDLLRNTHFRGVSLNLTRKLAQQLCTALLF 

LATPELSIIHCDLKPENILLCNPKRSAIKJVDFGSS 

CQLGQRIYQYIQSRFYRSPEVLLGTPYDLAIDMW 

OJ—VJv^lJL> V Hlvlxl X OEr JbroVJolNri V K^r V^UVJ V JL^v^jVJJNXVl 

VEVLGIPPAAMLDQAPKARKYFERLPGGGWTLR 

RTKELRKDYQGPGTRRLQEVLGVQTGGPGGRRA 

GEPGHSPAD\Y\LRFQDLVLRMLEYEPAARISPLG 

ALQHGFFRRTADEATNTGPAGSSASTSPAPLDTC 

PSSSTASSISSSGGSSGSSSDNRTYRYSNRYCGGP 

GPPITDCEMNSPQVPPSQPLRPWAGGDVPHKTH 

QAPASASSLPGTGAQLPPQPRYLGRPPSPTSPPPP 

ELMDVSLVGGPADCSPPHPAPAPQHPAASALRT 

RMTGGRPPLPPPDDPATLGPHLGLRGVPQSTAAS 

S 

3893 

A 

68 

258 

PEEYYPFSPTLQQLFFFLLDSDMGSRPESMGCRK 
NTVPRPASPTEAGTDPQTFLHTWVSECRD 

3894 

A 

1120 

136 

SLPLAPAPAVAGPVALCPAGLCPAQPGMPAGPA 
AASGSHPEVGSVLQRSSQPHWPNPWPGAGHLPP 
PAGPFPYNPPAGPGAAAGLA*SPPRSSPTPCSVGP 
QSCPANASAPPAQPCLAGAPPAASLPPPGPGSVS 
AAPAPGGPAPAEPPLGVPPVPAWLLPDSPPT PGT 

rlAi *VX \J vJ 1 <V1 / VJ_/i. x Lt\J V X XT VI i \ Y» X-ti-tk L/OX I XvX VJ I 

HSGPPPAAVSLPPAAAACPVVVPPPLPHHPPDLES 
PSAAAPNPGCAGGIRHFPPGSPEASSPLRPAAAPA 
LLPLPRPPS*PA^PWKPLHSPVAVAGGSFVAGGSV 
LPAPDLDQPRPSGPPAASPTPGPGVAQPPPGSAVL 
PTVP*APPVSGAAPGRKKEW 

3895 

A 

2 

1347 

FGAVSYRPGNGSCWVKVTASSDLSDL1SCLCPPR 

SLCSSQACVLPVPGPSLLLPQGLHVGCASAGTRW 

PLSCSIDFQRLLAHEEETQKIUIAK^SGMAFTQLT 

FRDVAIEFSQDEWKCLNSTQRTLYRDVMLENYR 

NLVSLDLSRNCVIKELAPQQEGNP/ARSIPHSDIGT 

T*KT*H*RVLLQGNQEKNTRL*LSVER**KKLQQ 

SDYGPKRKSYL*ERPTR*KRYRKQVY*TSA\*LSF 

LPHPHELQQFQAEGKIYECNHVEKSVNHGSSVSP 

POIISSTrKTHVSNKYGTDFrCSSLLTOFOKSCIRF 

X V^XlkJO 1 XIV 1 XX V OiNfV 1 VJ 1 XV* IV/OOLLy X ^LiyXVu V^-ilvAj 

KPYRYIECDKALNHGSHMTVRQVSHSGEKGYKC 
DLCGKVFSQKSNLARHWRVHTGEKPYKCNECD 
RSFS1WSCLALHRRVHTGEKPYKCYECDKVFSR 
NSCLALHQKTHIGEKPYTCKECGQAFSVRSTLTN 
HQVIHSDK 

3896 

A 

202 

498 

MVQSCSAYGCK1^YDKI)KJPVSFHI<^PLTRPSLC 
KEWEAAVRRKNFKPTKYSSICSEHFTPDCFKREC 
NNKLLKENAVPTIFLCTEPHDKKEDLLEPQEQ 

3897 

A 

2 

382 

SHGLSRAPHLSAAPAPALASRPCFSSAPCSQGGG 
GGGPATMIHFILLFSRQGKXRLQKVvTITLPDKER 
KKJTREIVQHLSRGTOTSSFVDWKELKLVYK^ 
SLYFCCAIE\NQDNELLTLENVHR 

3898 

A 

718 

305 

SEQEPLLGDTPGSREWDILETEEITxTCSRWRSITaL 
YLTMFLSSVGFSVVMMSIWPYLQKIDPTADTSFL 
GWVIASYSLGQMVASPIFGLWSNYRPRKEPLIVSI 
LISVAANCLYAYLHIPASHNKYYMLVARGLLGIG 

3899 

A 

24 

718 

FRGRPGIPEREGKGNHSFVEVARVIVVDLHSRLG 

GAMAEI^GTAKVDFLKKIEKEIOOKWDTERVFE 

VNASNLEKQTSKGKYFVTFPYPYIV1NGRLHLGHT 

FSLSKCEFAVGYQRLKGKCCLFPFGLHCTGMPIK 

ACADKLKREIELY/GCPPDFPDEEEEEEETSVKTE 

DniKDKAKGl<KSKAA/AKAGSSKYOWGIM 

LSDEEWKPSEAEHWLDYFNALAIQDLKJRMG 

3900 

A 

360 

1 

VPATSSNVSPSSSESSEPDLSSRSSSSDAPSSSPSVP 
SPCSLSLSSPESPLLPTLLSSKSPAGSAGPTCGCPS 
GPGLRATAy^PSRLSSSIAAH/SSSAPETSRPAAARE 
RSPPLHDRESHE 

3901 

A 

193 

345 

GEWAVPPAPGGQGVSIPHGPEPGQGSGVHIAPRQ 
GEGSDRTEPLICPKAAP 

3902 

A 

1188 

1389 

NPAARSAAAREGSPALPPPPVS/SSSGLGLLLPLSP 
PGSHAANPALSPRAPHSHYRPRPRCGPRRRPR 

3903 

A 

63 

396 

NNMRNPHLSSNHYLNLARTETVFARMESVKQRI 
LAPGKEGLKNFAGKSLGQIYRVLEKKQDTGET1E 
LTEDGKPL* VPERKAPLCDCTCFGLPRRYUAIMS 
GLGFCISFG 

3904 

A 

732 

1046 

AMSECPLILYIHKHIDTYSQSYLFNDLFYPVYSGG 
RMVTYEHLREV VFGKSEDEH YPLW* VLFGK* Y A 
VAPNALMFIRFM*NCTFVPKLP*VMDLK**LQYK 
SR 

3905 

A 

46 

910 

QPPPPPPPPPSPPPPPFPPARALSHLRLHPDACLFPS 

PFPLPCSTMPGMMEKGPELLGKNRSANGSAKSP 

AGGGGSGASSTNGGLHYSEPESGCSSDDEHDVG 

MRVGAEYQAR1PEFDPGATKYTDKDNGGMLVW 

SPYHSIPDAKLDEYIAIAKEKHGYNVEQALGMLF 

WHKHNIEKSLADLPNFTPFPDEWTVEDKVLFEQ 

AFSFHGKSFHRIQQMLPDKTIASLVKYYYSWKK 

TRSRTSLMDRQARKLANRHNQGDSDDDVEETHP 

MDGNDSDYDPKKEAKKEGMS 

3906 

A 

2 

513 

KVCNCCSQELETSFTYVDKNINLEQRNRSSPSAK 
GHNHPGELGWENPNEWSQEAAISLISEEEDDTSS 
EATSSGKSIDYGFISAILFLVTGILLVIISYIVPREV 
TVDPNTVAAREMERLEKESARLGAHLDRCVIAG 
LCLLTLGGVILSCLLMMSMWKGELYRRNRFAS 

3907 

A 

71 

412 

ILIMSNCLQNFLKITSTRLLCSRLCQQLRSKRKFF 
GTVPISRLHRRVViTGIGLVTPLGVGTHLVWDRLI 
GGESGIVSLVGEEYKSIPCSVAAYVPRGSDEGQF 
NEQNFVSKSD 

3908 

A 

77 

746 

LGTLLGWRAPLFSRCLAFHSPFILLNTPKLVKTAE 

LPPDRNYVLGAHPHGIMCTGFLCNFSTESNGFSQ 

LFPGLRPWLAVLAGLFYLPVYRDYIMSFGLCPVS 

RQSLDFILSQPQLGQAVVIMVGGAHEALYSVPGE 

HCLTLQKRKGFVRLALRHGASLVPVYSFGENDIF 

RLKAFATGSWQHWCQLTFKKLMGFSPCIFWGR 

GLFSATSWGLLPFAVPITTV 

3909 

A 

1 

793 

FRAAGRPAAAMGDIPWGLSSWKASPGKVTEAV 

KEAIDAGYRHFDCAYFYHNEREVGAGIRCKIKE 

GAVRREDLLIATKLWCTCHKKSLVETACRKSLK i 

ALKLNYLDLYLIHWPMGFKPPHPEWIMSCSELSF 

CLSHPRVQDLPLDESNMVIPSDTDFLDTWEAME 

DLVITGLVKNIGVSNFNHEQLERLLNKPGLRFKP 

LTNQIECHPYLTQKNLISFCQSRDVSVTAYRPLG 

GSCEGVDL1DNPVIKRIAKEHGKSPAQIL1 

3910 

A 

202 

705 

FFTMHRKKVDNRIRILIENGVAERQRSLFVVVGD 

RGKDQVVILHHMLSKATVKARPSVLWCYKKEL 

GFSSHRKKRMRQLQKKIKNGTLNIKQDDPFELFI 

AATNIRYCYYKETHKILGNTFGMCVLQDFEALTP 

NLLARTVETVEGGGLVVILLR'IMNSLKQLYTVT 

M 

3911 

A 

3 

723 

AGRGARAAGEGGGPFKSRPRPLPSSRSLPAVGGG 

RYGADKMAAGGAVAAAPECRLLPYALHKWSSF 

SSTYLPENILVDKPNDQSSRWSSESNYPPQYLILK 

LERPAIVQNITFGKYEKTHVCNLKKPKVFGGMN 

EENMTELLSSGLKNDYNKETFTLKHKIDEQMFPC 

RFIKIVPLLSWGPSFNFSIWYVELSGIDDPD1VQPC 

LNWYSKYREQEAIRLCLKHFRQHNYTEAFESLQ 
KKT 

3912 

A 

2 

461 

FEKKQLRRPSLFLLGCCSFGIMAPSLWKGLEGIG 

LFALAHAAFSAAQHRSYMRLTEKEDESLPIDIVL 

QTLLAFAVTCYGIVHIAGEFKDMDATSELKNKTF 

DTVRNHPSFYVFNHRGSEYFSGPSDTANSSNQDA 

LSSNTSLKLRKLESLRR 

3913 

A 

362 

20 

APGRPEAKVPERSRESGSRRVRGPLLQLRPGRTS 
RPASGRGRGGAGGSYGK1V1RKPDSKIVLLGDMN 
VGKTSLLQRYMERRFPDTVSTVGGAFYLKQWRS 
YNISIWDTAGEAGAA 

3914 

A 

1 

7545 

PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKXKDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIPCATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

IEADEGLIIGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 



MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDVVTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVVVESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SBDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 

3915 

A 

1 

7545 

PGIRVGITSQTGLSSNLQENCSKLAFISSHGTEKQ 

LQCMPMEGRGRASSSISDLQGKGFEKGTGEKHV 

PGVGSARHSPQASAGGSPWQRGKAQTRWLGKP 

DPGRKRRRGSPQEEGGLRVSAAARLLCSGANRC 

KVLVRQNSTPNTQQPAVHPSTPPSRPLPQAGRCL 

VAPLRPHPDWVAAKTLAKALRAPGKPWRLAAP 

SPLGDLGAPGLPGPSTAPRTLSVEEPGVECNQLC 

LYADVTDPVLCLGQKDPGVEGKHCEKEKISSSK 

ELKHVHAKSEPSKPARRLSESLHVVDENKNESKI 

EREHKRRTSTPVIMEGVQEETDTRDVKRQVERSE 

ICTEEPQKQKSTLKNEKHLKKDDSETPHLKSLLK 

KEVKSSKEKPEREKTPSEDKLSVKHKYKGDCMH 

KTGDETELHSSEKGLKVEENIQKQSQQTKLSSDD 

KTERKSKHRNERKLSVLGKDGKPVSEYIIKTDEN 

VRKENNKKERRLSAEKTKAEHKSRRSSDSKIQK 

DSLGSKQHGITLQRRSESYSEDKCDMDSTNMDS 

NLKPEEVVHKEKRRTKSLLEEKLVLKSKSKTQG 

KQVKVVETELQEGATKQATTPKPDKEKNTEEND 

SEKQRKSKVEDKPFEETGVEPVLETASSSAHSTQ 

KDSSHRAKLPLAKEKYKSDKDSTSTRLERKLSD 

GHKSRSLKHSSKDIKKXDENKSDDKDGKEVDSS 

HEKARGNSSLMEKKLSRRLCENRJRGSLSQEMAK 

GEEKLAANTLSTPSGSSLQRPKKSGDMTLIPEQEP 

MEIDSEPGVENVFEVSKTQDNRNNNSHQDIDSEN 

MKQKTSATVQKDELRTCTADSKATAPAYKPGR 

GTGVNSNSEKHADHRSTLTKKMHIQSAVSKMNP 

GEKEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNTAAEEHVAQGDATLEHSTNLDSSPSLSS 

VTVVPLRESYDPDVIPLFDKRTVLEGSTASTSPAD 

HSALPNQSLTVRESEVLKTSDSKEGGEGFTVDTP 

AKASITSKRHIPEAHQATLLDGKQGKVIMPLGSK 

LTGVIVENENITKEGGLVDMAKKENDLNAEPNL 

KQTIKATVENGKKDGIAVDHVVGLNTEKYAETV 

KLKHKRSPGKVKDISIDVERRNENSEVDTSAGSG 

SAPSVLHQRNGQTEDVATGPRRAEKTSVATSTE 

GKDKDVTLSPVKAGPATTTSSETRQSEVALPCTS 

IEADEGLIIGTHSRNNPLHVGAEASECTVFAAAEE 

GGAVVTEGFAESETFLTSTKEGESGECAVAESED 

RAADLLAVHAVKIEANVNSVVTEEKDDAVTSAG 

SEEKCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSISSEEVDGSQGNMMRMGPKKETEGTV 

TCTGAEGRSDNFVICSVTGAGPREERMVTGAGV 

VLGDNDAPPGTSASQEGDGSVNDGTEGESAVTS 

TGITEDGEGPASCTGSEDSSEGFAISSESEENGESA 

MDSTVAKEGTNVPLVAAGPCDDEGIVTSTGAKE 

EDEEGEDVVTSTGRGNEIGHASTCTGLGEESEGV 

LICESAEGDSQIGTVVEHVEAEAGAAIMNANENN 

VDSMSGTEKGSKDTDICSSAKGIVESSVTSAVSG 

KDEVTPVPGGCEGPMTSAASDQSDSQLEKVEDT 

TISTGLVGGSYDVLVSGEVPECEVAHTSPSEKED 

EDIITSVENEECDGLMATTASGDITNQNSLAGGK 

NQGKVLIISTSTTNDYTPQVSAITDVEGGLSDALR 

TEENMEGTRVTTEEFEAPMPSAVSGDDSQLTASR 

SEEKDECAMISTSIGEEFELPISSATTIKCAESLQP 

VAAAVEERATGPVLISTADFEGPMPSAPPEAESP 

LASTSKEEKDECALISTSIAEECEASVSGVVVESE 

NERAGTVMEEKDGSGIISTSSVEDCEGPVSSAVP 

QEEGDPSVTPAEEMGDTAMISTSTSEGCEAVMIG 

AVLQDEDRLTITRVEDLSDAAIISTSTAECMPISA 

SBDRHEENQLTADNPEGNGDLSATEVSKHKVPM 

PSLIAENNCRCPGPVRGGKEPGPVLAVSTEEGHN 

GPSVHKPSAGQGHPSAVCAEKEEKHGKECPEIGP 

FAGRGQKESTLHLINAEEKNVLLNSLQKEDKSPE 

TGTAGGSSTASYSAGRGLEGNANSPAHLRGPEQ 

TSGQTAKDSSVSSIRYLAAVNTGAIKADDMPPVQ 

GTVAEHSFLPAEQQGSEDNLKTSTTKCITGQESKI 

APSHTMIPPATYSVALLAPKCEQDLTIKNDYSGK 

WTDQASAEKTGDDNSTRKSFPEEGDIMVTVSSE 

ENVCDIGNEESPLNVLGGLKLKANLKMEAYVPS 

EEEKNGEILAPPESLCGGKPSGIAELQREPLLVNE 

SLNVENSGFRTNEEIHSESYNKGEISSGRKDNAE 

AISGHSVEADPKEVEEEERHMPKRKRKQHYLSSE 

DEPDDNPDVLDSRIETAQRQCPETEPHATKEENS 

RDLEELPKTSSETNSTTSRVMEEKDEYSSSETTGE 

KPEQNDDDTIKSQE 

3916 

A 

2 

773 

GPFGVLWPSAKPGPVTAVEARPPDASDPEGLRG 
GSPAPLLAPGPLDPSGRLHPAVSMMSYLKQPPYG 
MNGLGLAGPAMDLLHPSVGYPATPRKQRRERTT 
FTRSQLDVLEALFAKTRYPDIFMREEVALKINLPE 

SRVQVWFKNRRAKCRQQQQSGSGTKSRPAKKK 
SSPVRESSGSESSGQFTPPAVSSSASSSSSASSSSA 
NPAAAAAAGLWAKLPCPLHIFSLCVFIEENTILV 
SGSWARDIRSVEETDKSGYR 

3917 

A 

2 

776 

RN1PGRRFRPPGLRRLLKGPHMPREPRGYRTRVP 

ALRELVPSSHAGSGASEHCQNNRQGSRQHRASR 

NVOAGGALAPPRHLCGLCSRLHFI KPDT WRAA 

PSRAGASVMALRKELLKS1WYAFTALDVEKSGK 

VSKSQLRVLSHNLYTVLHIPHDPVALEEHFRDDD 

DGPVSSQGYMPYLNKYILDKVEEGAFVKEHFDE 

FNFLSEDKYPLIMDPDEGEYLLKRYS 

3918 

A 

10 

318 

WQDLVCLGGSRAQEQKPLQQLWNAILLVAMLL 

PTGT VVOAOROASRO^ORFT nnOVHT FKRRVV 

RRLASLKTRRCRLSRAAQGLPDPGAETCAVCLD 
YFCNKQ 

3919 

A 

1 

204 

RVLTA1NHTLKENLRKFYKGKKDKPLDLRPKKT 
RAMRRRLNIVlflEENLKTKKQHRKERLYPLRXYA 
AKA 

3920 

A 

1 

654 

RCCRSFVAPLQEKVVFGLFFLGAILCLSFSWLFHT 

V I OrlollO V oi\X/r oJvUlJ X oOlALLlIVlvjor VrWLY Y 

SFYCNPQPCFIYLIVICVLGIAAI1VSQWDMFATPQ 

YRGVRAGVFLGLGLSGIIPTLHYV1SEGFLKAATI 

GQIGWLMLMASLYITGAALYAAR1PERFFPGKCD 

IWFHSHQLFHIFVVAGAFVHFHGVSNLQEFRFMI 

GGGCSEEDAL 

3921 

A 

1587 

452 

LERDGCGGEEGGSVRSGAGPDSDPRGASSPPAG 

HRGTAASPRPVAAPSRTPAPPHTRARASPGLPSG 

PAWRRVQWFSRVSGQVSTLMBCATVLMRQPGRV 

QEIVGALRKGGGDRLQVISDFDMTLSRFAYNGK 

RCPSSYN1LDNSKIISEECRKELTALLHHYYPIEID 

PHRTVKEKJ.PHMVEWWTKAHNLLCQQKIQKFQ1 

AOVVRF9NAMT R FGVRTTFFNTT VTTNTMTPT PTF^A 

GIGDILEEIIRQMKVFHPN1HIVSNYMDFNEDGFL 
OGFKGOT THTYNKN^SArFNrnYFOOl FGKTMV 
ELLGDSIGDLTMADGVPGVQNILKIGFLNDKVEE 
RRERYMDSYDIVLEKDETLDVVNGLLQHILCQG 
VQLEMQGP 

3922 

A 

2 

164 

GKIYQRAFGGHSLKFGKGVQAHGCCCVADRTG 
HSILHTSYGRERPAPVHLRQDT 

3923 

A 

2 

3258 

EHATEL\YAKLGTRRRHREVTVFVPTWQLKBCNR 

RVRESHFLTKLHSLKMLSITPSQLENGKKITTYD 

YRFMVKLAEETDGI1VTNEQIHILMNSSKKLMVK 

D1nI.LPFTFAGNLFMVPDDPLGRDGPTLDEFLKKP 

NRLDTDIGNFLKVAVKTLPPSSASVTELSDDADSG 

PLESLPNMEEVREEKEERQDEEQRQGQGTQKAA 

EEDDLDSSLASVFRVECPSLSEEILRCLSLHDPPD 

GALDEDLLPGAASPYLGIPWDGKAPCQQVLAHL 

AQLTIPSNFTALSFFMGFMDSHRDAIPDYEALVG 

PLHSLLKOKPDWOWDOEffiEAFLALKJRALVSAL 

CLMAPNSQLPFRLEVTVSHVALTAILHQEHSGRK 

HPIAYTSKPLLPDEESQGPQSGGDSPYAVAWALK 

HFSRCIGDTPVVLDLSYASRTTADPEVREGRRVS 

KAWLIRWSLLVQDKGKRALELALLQGLLGENRL 

LTPAASMPRFFQVLPPFSDLSTFVCIHMSGYCFYR 

EDEWCAGFGLYVLSPTSPPVSLSFSCSPYTPTYA 

HLAAVACGLERFGQSPLPVVFLTHCNWIFSLLWE 

LLPLWRARGFLSSDGAPLPHPSLLSYIISLTSGLSS 

LPFIYRTSYRGSLFAVTVDTLAKQGAQGGGQWW 

SLPKDVPAPTVSPHAMGKRPNLLALQLSDSTLAD 

IIARLQAGQKLSGSSPFSSAFNSLSLDKESGLLMF 

KGDKKPRVWVVPTQLRRDLIFSVHD1PLGAHQR 

PEETYKKLRLLGWWPGMQEHVKDYCRSCLFCIP 

RNLIGSELKVIESPWPLRSTAPWSNLQIEWGPVT 

ISEEGHKHVLIVADPNTRWVEAFPLKPYTHTAVA 

QVLLQHVFARWGVPVRLEAAQGPQFARHVLVS 

CGLALGAQVASLSRDLQFPCLTSSGAYWEFKRA 

LKEFIFLHGKKWAASLPLLHLAFRASSTDATPFK 

VLTGGESRLTEPLWWEMSSANIEGLKMDVFLLQ 

LVGELLELHWRVADKASEKAENRRFKRESQEKE 

WNVGDQVLLLSLPRNGSSAKWVGPFYIGDRLSL 

SLYRIWGFPTPEKLGCIYPSSLMKAFAKSGTPLSF 

KVLEQ 

3924 

A 

1 

1826 

MGSVTVRYFCYGCLFTSATWTVLLFVYFNFSEV 

TQPLKNVPVKGSGPHGPSPKKFYPRFTRGPSRVL 

EPQFKANK1DDVIDSRVEDPEEGHLKFSSELGM1F 

NERDQELRDLGYQKHAFNMLISDRLGYHRDVPD 

TRNAACKEKFYPPDLPAASVVICFYNEAFSALLR 

TVHSVIDRTPAHLLHEIILVDDDSDFDDLKGELDE 

YVQKYLPGKIKVIRNTKREGLIRGRMIGAAHATG 

EVLVFLDSHCEVNVMWLQPLLAAIREDRHTVGC 

PVIDIISADTLAYSSSPVVRGGFNWGLHFKWDLV 

PLSELGRAEGATAPIKSPTMAGGLFAMNRQYFH 

ELGQYDSGMD1WGGENLEISFRIWMCGGKLF1IP 

CSRVGfflFRKRRPYGSPEGQDTMTHNSLRLAHV 

WLDEYKEQYFSLRPDLKTKSYGNISERVELRKKL 

GCKSFKWYLDNVYPEMQISGSHAKPQQPIFVNR 

GPBCRPKVLQRGRLYHLQTNKCLVAQGRPSQKG 

GLVVLKACDYSDPNQIWIYNEEHELVLNSLLCLD 

MSETRSSDPPRLMKCHGSGGSQQWTFGKNNRLY 

QVSVGQCLRAVDPLGQKGSVAMAICDGSSSQQ 

WHLEG 

3925 

A 

5386 

2897 

VRWNSKTECYLSIQTQENFPANLNELVNCIV1SSL 

VTTQRKLKAMSLLGSRNQLARAVLNPNPMDFCT 

KI)LLTTTSERJ1AYLRDFNEDQKKAIETAYAMVK 

HSPSVAKICLfflGPPGTGKSKTIVGLLYRLLTENQ 

RKGHSDENSNAKIKQNRVLVCAPSNAAVDELM 

KKIILEFKEKCKDKKNPLGNCGDINLVRLGPEKS1 

NSEVLKFSLDSQVNHRMKKELPSHVQAMHKRK 

EFLDYQLDELSRQRALCRGGREIQRQELDENISK 

VSKERQELASKIKEVQGRPQKTQSIIILESHIICCT 

LSTSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEI 

ETLTPLIHRCNKLILVGDPKQLPPTVISMKAQEYG 

YDQSMMARFCRLLEENVEHNMISRLP1LQLTVQ 

YRMHPDICLFPSNYVYNRNLKTNRQTEAIRCSSD 

WPFQPYLVFDVGDGSERRDNDSYINVQEIKLVM 

EIIKLIKDKilKDVSFRNIGIITHYKAQKTMIQKDL 

DKEFDRKGPAEVDTVDAFQGRQKDCVIVTCVRA 

NSIQGSIGFLASLQRLNVTITRAKYSLFILGHLRTL 

MENQHWNQLIQDAQKRGAIIKTCDKNYRHDAV 

KILKLKPVLQRSLTHPPTIAPEGSRPQGGLPSSKL 
nsnFAKT^VAA^I YHTPSD^KFTTl TVT^KDPPRP 

PVHDQLQDPRLLKRMGffiVKGGIFLWDPQPSSPQ 
HPG ATPPTGEPGFPVVHODLSHVOOPA AW A Al 

X IX VJ tx. Ill 1 VJ 1>1 VJ Ik V V 1 1 V/ VJ 1 A V VVVy 1 JTx/i. V V r-\ ■ . 

SSHKPPVRGEPPAASPEASTCQSKCDDPEEELCH 
RMARAFSEGEQEKCGSETHHTRl^SRWDKRTL 
EQEDSSSKKRKLL 

3926 

A 

99 

284 

MPREDRATWKSNYFLKIIQLLDDYPKilFIVGANN 
VGSKQMQQIRMSLRGKAVVLMGKNTMMR 


A 


9 

A HI 1 MT NF AI YTDT T \YT TST PFT TTTWA^OPMWT 

FGDFMCKFIRFSFHFNLYSSILFLTCFSIFRYCVIIH 
PMSCFSIHKTRCAVVACAVVWIISLVAVIPMTFLI 
TSTNRTNRSACLDLTSSDELNTIKWYNLILTAMLL 
CLPLVIVTLCYTTnHTLTHGHA>ADSCLKQKARR 
LTILLL 

3928 

A 

1 

1516 

GEEAVGGGAEGGGFGVGAQGRAGGRGVEAGR 

IV1TU.SKTLVDMDMADYSAALDPAYTTLEFENVQ 

VLTMGNDTSPSEGTNLNAPNSLGVSALCAICGDR 

ATGKJHYGASSCDGCKGFFRRSVRKNHIV^ 

RQCVVDKDKR^IQCRYCRJLKKCFRAG^1 J <JK£AV 

QNERDRISTRRSSYEDSSLPSINALLQAEVLSRQIT 

SPVSG1KGDIRAKXIASIADVCESMKEQLLVLVE 

WAKY1PGFCELPLDDQGALLRAHAGEHLLLGAT 

KRSMVFKDVLLLGNDYIVPRHCPELAEMSRVSIR 

ILDELVLPFQELQIDDNEYAYLKAIIFFDPDAKGL 

olJr vjiVliSJvi^Kov^ vy V oLciJ i llNXJlVv^ I IJoJlvOIaX vJJc, 

LLLLLPTLQSITWQM1EQIQFIKXFGMAKIDNLLQ 

EMLLGGSPSDAPHAHHPLHPI^MQEHMGTNVIV 

ANTMPTHLSNGQMCEWPRPRGQAATPETPQPSP 

PGASGSEPYKLLPGAVATIVKPLSAIPQPTITKQE 

VI 

3929 

A 

1 

2782 

RVLSLESPLEKDPRVLGAQSVPRGRALKGLSPLG 

LDSAFRLFPDPRAGPWNTAVLSSGMEPETALWG 

PDLQGPEQSPNDAHRGAESENEEESPRQESSGEEI 

IMGDPAQSPESKDSTEMSLERSSQDPSVPQNPPTP 

LGHSNPLDHQIPLDPPAPEVVPTPSDWTKACEAS 

WQWGALTTWNSPPVVPANEPSLRELVQGRPAG 

AEKPYICNECGKSFSQWSKLLRHQRIHTGERPNT 

CSECGKSFTQSSHLVQHQRTHTGEKPYKCPDCG 

KCFSWSSNLVQHQRTHTGEKPYKCTECEKAFTQ 

STNLIKHQRSHTGEKPYKCGECRRAFYRSSDLIQ 

HQATHTGEKPYKCPECGKRFGQNHNLLKHQKIH 

AGEKPYRCTECGKSFIQSSELTQHQRTHTGEKPY 

ECLECGKSFGHSSTLIKHQRTHLREDPFKCPVCG 

KTFTLSATLLRHQRTHTGERPYKCPECGKSFSVS 

SNLINHQRJHRGERPY1CADCGKSFIMSSTLIRHQ 

RIHTGEKPYKCSDCGKSFIRSSHLIQHRRTHTGEK 

PYKCPECGKSFSQSSNLITHVRTHMDENLFVCSD 

CGKAFLEAHELEQl^VIHERGKTPARRAQGDSL 

LGLGDPSLLTPPPGAKJPHKCLVCGKGFNDEG1PM 

QHQRffllGENPYKNADGLIAFLAAPKPPQLRSPRL 

PFRGNSYPGAAEGRAEAPGQPLKPPEGQEGFSQR 

RGLLSSKTYICSHCGESFLDRSVLLQHQLTHGNE 

KPFLFPDYRJGLGEGAGPSPFLSGKPFKCPECKQS 

FGLSSELLLHQKVHAGGKSSHKSPELGKSSSVLL 

EHLRSPLGARPYRCSDCRASFLDRVALTRHQETH 
TOFKPPNPFDPPPEAVTLSTDOEGEGETPTPTESS 
SHGEGQNPKTLVEEKPYLCPECGAGFTEVAALLL 
HRSCHPGVSL 

3930 

A 

A 


Oil 
J. to 

KTOFTRTYTSFHIFFPFT OGFGNLPICMAKTDLSLS 
HQPDKKGVPSDFILPISDVRASIGAGFIYPLVGTG 
SRESPLWL 

3931 

A 

A 

1 £ 


KRRDFI SPWFAFTVLGEARGDOVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 

3932 

A 

16 

305 

KJIRDFLSCWPAFTVLGEARGDQVDWSKLYRDT 
GLVKMSRKPRASSPFSNNHPSTPKRRGRGKHPLI 
PGPEALSKFPRQPIREKGPVKEVPGTKGSP 

3933 

A 

1 

1546 

STHASEHWDSALQLAKHLAPDQIPFISKEYAIQLE 

FAGDYVNALAHYEKGITGDNKEHDEACLAGVA 

QMSIRMGDIRRGVKQALKJIPSRVLKRDCGAILE 

NMKQFSEAAQLYEKGLYYDKAASVY1RSKNWA 

KVGDLLPHVSSPKIHLQYAKAKEADGRYKEAVV 

AYENAKQWQSVnUYLDHLNNPEKAVNIVRETQ 

SLDGAKMVARFFLQLGDYGSAIQFLVMSKCNNE 

AFTLAQQ1WKMEIYADIIGSEDTTNEDYQSIALY 

FEGEKRYLQAGKPFLLCGQYSRALKHFLKCPSSE 

DNVAIEMAIETVGQAKDELLTNQLIDHLLGEND 

r*\A'D\YV*AW1 CT?T VA>f AT VfWRF A AOTATTT AftFF 

QSAGNYRNAHDVLFSMYAELKSQKIKIPSEMAT 

NLMILHSYILVKIHVK^GDHMKGARMLIRVANN 

ISKFPSHIVPILTSTVIECHRAGLKNSAFSFAAML 

MRPEYTlSKIDAKYKKXIEGMVRRPDISEffiEATTP 

CPFCKFLLPESELL 

3934 

A 

334 

1268 

PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCLSVS 
GIGGFLVSLSSRMKLQTLAVSVTALKFWSAYVP 
CQTQDRDALRJLTLEQIDLIRJvMCASYSELELVTS 
AKALNDTQKLACLIGVEGGHSLDNSLSILRTFYM 

I_aj VixiLrlLrlrll V-/1N lrw /\DoorVJ\.vj v nor i ininiovjjj 

TDFGEKVVAElvl^RLGlvlMVDLSHVSDAVARRAL 

EVSQAPV1FSHSAARGVCNSARNVPDDILQLLEE 

ERWAFVMVSLFHGELIQWQPIRPMCSTVADHFD 

HIKAWGSKFIGIGGDYDGAGKYRKKTTCKAPW 

RTSSRMSS 

3935 

A 

1 

883 

HETTPAVVQSVLLERGWNKFDKQEQNAEDWNL 
YWRTSSFRMTEHNSVKPWQQLNHHPGTTKLTR 
KDCLAKHLKHMRJxMYGTSLYQFIPLTFVMPNDY 

TlfFV A F YFOFTv OMT nTTCHSYWTCKPAELSRGRG 

1LITSDFKDFITODMYIVQKYISNPLLIGRYKCDLR 

IYVCVTGFKPLTIYVYQEGLVRFATEKFDLSNLQ 

NNYAHLTNSSINKSGASYEKIKEVIGHGCKWTLS 

RFFSYLRSWDVDDLLLWKKIHRMVILTILAIAPS 

VPFAANCFELFGFDILIDDNEFHRTG 

3936 

A 

203 

441 

HLAHSLGPLPKHYQYCVRYLYYQVTKDVIKEFA 
DDGVKYLELRSTPRRENATGMTKKTYVESBLEGI 
KQSKQENLDIDV 

TABLE 7 


SEQ ID NO: 

Position of end of 
Signal in Amino Acid 
Sequence 

MaxS (Maximum 
Score) 

MeanS (Mean Score) 

1 

19 

0.930 

0.680 

2 

24 

0.964 

0.863 

3 

21 

0.990 

0.901 

4 

19 

0.981 

0.942 

5 

22 

0.991 

0.928 

6 

21 

0.956 

0.843 

8 

22 

0.913 

0.718 

9 

17 

0.997 

0.969 

11 

19 

0.930 

0.680 

13 

36 

0.983 

0.863 

14 

28 

0.935 

0.839 

15 

21 

0.997 

0.955 

16 

16 

0.983 

0.944 

17 

18 

0.989 

0.884 

19 

49 

0.996 

0.719 

20 

28 

0.972 

0.920 

21 

23 

0.954 

0.905 

22 

46 

0.955 

0.568 

23 

26 

0.942 

0.654 

24 

19 

0.979 

0.941 

25 

34 

0.884 

0.565 

26 

33 

0.934 

0.584 

27 

17 

0.975 

0.914 

28 

18 

0.980 

0.934 

29 

23 

0.928 

0.718 

30 

26 

0.978 

0.885 

32 

20 

0.946 

0.719 

33 

29 

0.933 

0.671 

35 

25 

0.996 

0.920 

36 

26 

0.903 

0.579 

40 

19 

0.981 

0.942 

47 

25 

0.971 

0.909 

53 

22 

0.991 

0.928 

55 

24 

0.960 

0.808 

60 

19 

0.986 

0.967 

78 

22 

0.913 

0.718 

86 

20 

0.883 

0.555 

87 

24 

0.982 

0.889 

88 

17 

0.997 

0.969 

115 

19 

0.930 

0.680 

134 

36 

0.983 

0.863 

136 

17 

0.913 

0.696 

137 

19 

0.958 

0.905 

140 

28 

0.935 

0.839 

143 

32 

0.914 

0.740 

153 

21 

0.997 

0.955 

154 

25 

0.913 

0.583 

155 

29 

0.972 

0.857 

169 

30 

0.977 

0.817 

170 

30 

0.977 

0.819 

171 

30 

0.977 

0.819 

175 

47 

0.926 

0.606 

176 

30 

0.968 

0.872 

177 

22 

0.957 

0.791 

192 

43 

0.930 

0.678 


WO 01/57190 


PCT/US01/04098 



195 

19 

0.956 

0.860 

202 

21 

0.982 

0.871 

203 

24 

0.957 

0.870 

207 

23 

0.954 

0.905 

224 

46 

0.955 

0.568 

225 

26 

0.942 

0.654 

228 

45 

0.961 

0.839 

231 

28 

0.994 

0.937 

232 

28 

0.993 

0.896 

234 

19 

0.979 

0.942 

235 

19 

0.979 

0.941 

238 

20 

0.987 

0.943 

244 

23 

0.929 

0.683 

250 

34 

0.884 

0.565 

256 

33 

0.934 

0.584 

258 

25 

0.934 

0.729 

259 

22 

0.969 

0.871 

264 

19 

0.952 

0.753 

265 

17 

0.975 

0.914 

266 

17 

0.975 

0.914 

271 

23 

0.974 

0.884 

274 

13 

0.971 

0.834 

275 

18 

0.980 

0.934 

278 

32 

0.958 

0.668 

280 

24 

0.966 

0.881 

281 

24 

0.966 

0.881 

286 

23 

0.928 

0.718 

291 

35 

0.991 

0.824 

293 

27 

0.956 

0.806 

294 

23 

0.952 

0.827 

301 

26 

0.978 

0.885 

316 

20 

0.946 

0.719 

320 

28 

0.978 

0.726 

327 

29 

0.933 

0.671 

331 

48 

0.903 

0.571 

345 

25 

0.996 

0.920 

349 

26 

0.903 

0.579 

351 

24 

0.951 

0.876 

352 

18 

0.944 

0.716 

353 

32 

0.992 

0.854 

354 

27 

0.945 

0.817 

355 

16 

0.922 

0.716 

356 

13 

0.959 

0.818 

357 

23 

0.986 

0.878 

358 

19 

0.904 

0.671 

359 

16 

0.988 

0.951 

360 

15 

0.981 

0.938 

361 

18 

0.944 

0.716 

362 

21 

0.984 

0.869 

363 

40 

0.979 

0.813 

364 

18 

0.883 

0.693 

365 

22 

0.962 

0.908 

366 

22 

0.961 

0.827 

367 

44 

0.941 

0.624 

368 

20 

0.952 

0.791 

369 

22 

0.949 

0.840 

370 

28 

0.957 

0.682 

372 

28 

0.974 

0.894 

373 

19 

0.972 

0.947 

374 

29 

0.968 

0.785 

375 

19 

0.949 

0.897 

377 

23 

0.962 

0.910 

378 

31 

0.974 

0.895 

379 

26 

0.969 

0.939 

380 

27 

0.945 

0.817 

383 

27 

0.945 

0.817 

384 

25 

0.992 

0.877 

385 

32 

0.983 

0.825 

386 

44 

0.924 

0.564 

387 

26 

0.971 

0.894 

388 

19 

0.989 

0.862 

389 

24 

0.990 

0.947 

390 

34 

0.942 

0.635 

391 

16 

0.922 

0.716 

394 

19 

0.987 

0.970 

398 

36 

0.992 

0.866 

404 

13 

0.959 

0.818 

417 

23 

0.986 

0.878 

421 

19 

0.904 

0.671 

425 

28 

0.971 

0.717 

431 

16 

0.988 

0.951 

452 

18 

0.944 

0.716 

459 

21 

0.991 

0.902 

468 

21 

0.984 

0.869 

478 

40 

0.979 

0.813 

486 

18 

0.883 

0.693 

499 

22 

0.962 

0.908 

501 

19 

0.962 

0.877 

514 

44 

0.941 

0.624 

529 

20 

0.952 

0.791 

533 

39 

0.914 

0.719 

548 

28 

0.957 

0.682 

561 

28 

0.974 

0.894 

562 

28 

0.974 

0.893 

564 

18 

0.949 

0.806 

576 

19 

0.972 

0.947 

584 

29 

0.968 

0.785 

585 

28 

0.973 

0.810 

591 

19 

0.949 

0.897 

592 

24 

0.991 

0.954 

594 

20 

0.985 

0.959 

595 

20 

0.985 

0.959 

612 

23 

0.962 

0.910 

619 

31 

0.974 

0.895 

621 

15 

0.959 

0.795 

633 

26 

0.969 

0.939 

640 

20 

0.949 

0.842 

645 

25 

0.911 

0.759 

684 

25 

0.992 

0.877 

691 

32 

0.983 

0.825 

698 

44 

0.924 

0.564 

700 

19 

0.982 

0.941 

710 

26 

0.971 

0.894 

714 

23 

0.965 

0.907 

71R 

19 

0.989 

0.862 

79 S 

21 

0.976 

0.851 

77 8 

33 

0.961 

0.895 

714 

25 

0.963 

0.660 

741 

/HI 


0.942 

0.635 

744 

19 

0.959 

0.924 

747 

16 

0.922 

0.716 

7S6 

26 

0.973 

0.864 

767 

22 

0.986 

0.943 

768 

27 

0.916 

0.758 

769 

19 

0.987 

0.970 

770 

22 

0.981 

0.933 

771 

34 

0.993 

0.893 

771 
1 l o 

20 

0.968 

0.939 

lid 

1 /H 

21 

0.971 

0.945 

778 
/ / o 

22 

0.986 

0.943 

770 

19 

0.973 

0.846 

7R1 

/O 1 

9T 

0.950 

0.857 

7RS 

/ O J 

97 

0.916 

0.758 

7R/i 

27 

0.916 

0.758 

7RR 

9? 

0.981 

0.933 

7QT 

9? 

0.986 

0.803 

704 


0.892 

0.654 

707 

77 

0.965 

0.847 

Rin 

99 

0.981 

0.933 

R91 

14 

0.993 

0.893 

R9*; 

17 

0.962 

0.778 

R17 


\j.y\jo 

0.939 

R44 

9S 

0.984 

0.951 

R4*\ 

17 

0.919 

0.706 

R4£ 
OHO 

91 

0.971 

0.945 

R47 

21 

0.971 

0.945 

ROD 

22 

0.986 

0.943 

R01 

0"J 

94 

0.971 

0.865 

R04 

24 

0.971 

0.865 


32 

0.973 

0.846 

RO0 

077 

31 

0.982 

0.817 

097 

15 

0.882 

0.706 

094 

21 

0.975 

0.948 

925 

21 

0.927 

0.661 

933 

20 

0.967 

0.906 

960 

20 

0.967 

0.906 

967 

38 

0.970 

0.784 

968 

47 

0.970 

0.557 

972 

36 

0.945 

0.775 


TABLE 8 


SEQ ID NO: 

Method 

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic 
Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, 
R=Arginine, S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible nucleotide 
insertion) 

3955 

A 

235 

1272 

GPREVLAASSLADGSEEQVMAVALVRERDLSFPG 
VGDAVVNPmWHLPAQPEMLYEGGEGRMETLK 

DKTLQELEELQNDSEAIDQLALESPEVQDLQLERE 

MALATNRSLAERNLEFQGPLEISRSNLSDRYQELR 

KLVERCQEQKAKLEKFSSALQPGTLLDLLQVEGM 

KIEEESEAMAEKFLEGEVPLETFLENFSSMRMLSH 

LRRVRVEKLQEVVRKPRASQELAGDAPPPRSPPP 

V/PPSPPGNTPCG * RAAA ATISHASLPF ALQP1PQPA 

CGPHCPWSPATGPFPSSVPALLLQRASGPHLPGSP 

AWTQGCCGLLLVPTEEHAAPPYGFPPPPGPAWPG 

Y 

3956 

A 

821 

385 

SICADRTERVGIFFYIPAGTTDEADVTHP*EGHSYL 

SNHAG1QRSSRP/SHYQGE/WHDNCFTADELQLLT 

YQLCHTYVRCTRSVSIPAPAYYAHLVAFRARYHL 

VDKEHDSAEGSHVSGQSNGRDPQALAKAVQIHQ 

DTLRTMYFA 

3957 

A 

4621 

240 

ELISTFKLLLEKKRSEVMKMKKRYEVGLEKLDSA 

SSQVATMQMELEALHPQLKVASKEVDEMMIM1E 

KESVEVAKTEK1VKADETIANEQAMASKA1KDEC 

DADLAGALPILESALAALDTLTAQDITWKSMKSP 

PAGVKLVMEAICILKGIKADKIPDPTGSGKKIEDF 

WGPAKRLLGDMRFLQSLHEYDKDNIPPAYMN1IR 

KN Y1PNPDF VPEKIRN ASTA A EGLCK WV I AMD S Y 

DKVAKIVAPKK1KLAAAEGELKIAMDGLRKKQA 

ALKEVQDKLARLQDTLELNKQKKADLENQVDLC 

SKKLERAEQLIGGLGGEKTRWSHTALELGQLYIN 

LTGD1LISSGVVAYLGAFTSTYRQNQTKEWTTLCK 

GRDIPCSDDCSLMGTLGEAVT1RTWNIAGLPSDSF 

SIDNGIIIMNARRWPLMIDPQSQANKWIKNMEKA 

NSLYVIKLSEPDYVRTLENCIQFGTPVLLENVGEE 

LDPILEPLLLKQTFKQGGSTCIRLGDSTIEYAPDFR 

FYITTKLRNPHYLPETSVKVTLLNFMITPEGMQDQ 

LLGIVVAQERPDLEEEKQALILQGAENKRQLKEIE 

DKILEVLSSSEGNILEDETAKILSSSKALANEISQK 

QEVAEETEKKIDTTRMGYRPIAIHSSILFFSLADLA 

NIEPMYQYSLTWFINLFILSIENSEKSEILAKRLQIL 

KDHFITSLYVNVCRSLFEKI)KLLFSFCLTINLLLH 

ERAINKAEWRFLLTGGIGLDNPYANPCTWLPQKS 

WDEICRLDDLPAFKTIRREFMRLKDGWKKVYDSL 

EPHHEVFPEEWEDKANEFQRMLIIRCLRPDKVPM 

LQEFIINRLGRAFIEPPPFDLAKAFGDSNCCAPLIFV 

LSPGADPMAALLKFADDQGYGGSKLSSLSLGQGQ 

GPIAMKMLEKAVKEGTWVVLQNCHLATSWMPT 

LEKVCEELSPESTHPDFRMWLTSYPSPNFPVSVLQ 

NGVKMTNEAPKGLRANIIRSYLMDPISDPEFFGSC 

KKPEEFKJKLLYGLCFPHALVQERRKFGPLWWNIP 

YEFNETDLRISVQQLHMFLNQYEELPYEALRYMT 

GECNYGGRVTDDWDRRTLRSILNKFFNPELVENS 

DYKFDSSGIYFVPPSGDHKSYIEYTKTLPLTPAPEI 

FGMNANADITKDQSETQLLFDNILLTQSRSAGAG 

AKSSDEVVNEVASDILGKLPNNFDIEAAMRRYPT 

TYTQSMNTVLVQEMGRFNK1.LKTIRDSCVNIQKA 

IKGLAVMSTDLEEWSSILNVKIPEMWMGKSYPS 

LKPLGSYVNDFLARLKJLQQWYEVGPPPVFWLSG 

FFFTQAFLTGAQQNYARKYTIPIDLLGFDYEVMED 

KEYKHPPEDGVFIHGLFLDGASWNRKJKKLAESH 

PKILYDTVPVMWLKPCKRADIPKRPSYVAPLYKT 

SERRGVLSTTGHSTNFVIAXMTLPSDQPKEHWIGR 
GVAI 1 CO! NS 

3958 

A 

35 

529 

GADMAKSKNHTTHNQSRKWHRNVIKKPLSQRYK 

SLKGVDPKFLGNMCFTKKHKKKGLKKMQADSA 

KAVSTCAKA1EALVKPKEVKPKIPKGVSCELN*LA 

YIAYPKFWTCACAC1AKGLRLCQPKAKAQDQTK 

AQVQIKAQAAAPASVPTQAPKGAQAPTKASG 

3959 

A 

1883 

763 

LLVLLLRTNLLIASSTR1SRATLTCSPPGIPVDPRVR 
PRVRSHLVMYLGITTGSLHKAVVSGDSSAHLVEEI 
QLFPDPEPVRNLQLAPTQGAVFVGFSGGVWRVPR 
ANCSVYESCVDCVLARDPHCAWDPESRTCCLLSA 
PNLNSWKQDMERGNPEWACASGPMSRSLRPQSR 

VPEASSTVYNGSLLLIVQDGVGGLYQCWATENGF 
SYPVISYWVDSQDQTLALDPELAGIPREHVKVPLT 
RVSGGAALAAQQSYWPHFVTVTVLFALVLSGALI 
ILVASPLRALRARGKVQGCETLRPGEKAPLSREQH 
LQSPKECRTSASDVDADNNCLGTEVA 

3960 

A 

1 

481 

SYAAPSLFVKSLYWALAFMAVLLAVSGVVIVVLA 
SRAGARCQQCPPGWVLSEEHCYYFSAEAQAWEA 
SQAFCSAYHATLPLLSHTQDFLGRYPVSRHSWVG 
AWRGPQGWHWIDEAPLPPQLLPEDGEDNLDINCG 
ALEEGTLVAANCSTPRPWVCAKGTQ 
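
The one-letter code and the three special symbols given in the column heading of Table 8 can be applied mechanically when reading a listed sequence. The Python sketch below is illustrative only and is not part of the disclosed methods; the names `AMINO_ACIDS`, `MARKERS`, and `annotate` are hypothetical. It counts the recognized residues, reports where the stop-codon, deletion, and insertion symbols occur, and flags any character that the legend does not define, which is a convenient way to spot transcription errors in a copied sequence.

```python
# One-letter codes as given in the Table 8 column heading.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Special symbols from the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def annotate(sequence):
    """Summarize a peptide string (hypothetical helper, for illustration).

    Returns the number of recognized residues, the 1-based positions of the
    special symbols, and any characters that the legend does not define.
    """
    compact = "".join(sequence.split())  # drop spaces and line breaks
    residues = 0
    markers = []       # (position, meaning)
    unrecognized = []  # (position, character)
    for pos, ch in enumerate(compact, start=1):
        if ch in AMINO_ACIDS:
            residues += 1
        elif ch in MARKERS:
            markers.append((pos, MARKERS[ch]))
        else:
            unrecognized.append((pos, ch))
    return {"residues": residues, "markers": markers, "unrecognized": unrecognized}


if __name__ == "__main__":
    # Fragment copied from the SEQ ID NO: 3956 entry; the digit "1" is not a
    # defined symbol and is therefore reported as unrecognized.
    print(annotate("SNHAG1QRSSRP/SHYQGE/WHDNCF"))
```

Positions are counted over the compacted string, so they include the special symbols themselves.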


TABLE 9 

SEQ ID NO: | Accession Number | Species | Description | Smith-Waterman Score | % Identity 
3937 | Y27700 | Homo sapiens | Human secreted protein encoded by gene No. 12. | 193 | 25 
3938 | AF093097 | Homo sapiens | putative RNA-binding protein Q99 | 3881 | 84 
3939 | AB012308 | Anthocidaris crassispina | B2HC | 4169 | 74 
3940 | U10248 | Homo sapiens | ribosomal protein L29 | 787 | 95 
3941 | Y99418 | Homo sapiens | Human PRO1317 (UNQ783) amino acid sequence SEQ ID NO:277. | 4031 | 100 
3942 | AL023516 | Gallus gallus | B locus C type Lectin | 198 | 35 
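
Table 9 reports a Smith-Waterman score and a percent identity for each nearest database match. The scoring matrix, gap penalties, and software behind those values are not given in the table, so the Python sketch below only illustrates how a local-alignment score and an identity percentage can be computed in principle. The function name `smith_waterman` and the flat match, mismatch, and gap values are assumptions chosen for readability; they will not reproduce the scores listed in the table.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local alignment score, percent identity of that alignment).

    Uses a flat match/mismatch score and a linear gap penalty; these values
    are illustrative assumptions, not the parameters used for Table 9.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; cells never drop below zero,
    # which is what makes the alignment local.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached,
    # counting aligned columns and identical residue pairs.
    i, j = best_pos
    aligned = identical = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                identical += 1
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        aligned += 1

    identity = 100.0 * identical / aligned if aligned else 0.0
    return best, identity


if __name__ == "__main__":
    print(smith_waterman("ACDEFG", "ACDQFG"))
```

Here percent identity is taken over the columns of the best local alignment, gaps included; other tools define the denominator differently, which is one more reason the numbers would not match the table.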



TABLE 10 

SEQ ID NO: | Accession No. | Description | Results* 
3937 | PR00049 | WILM'S TUMOUR PROTEIN SIGNATURE | PR00049D 0.00 9.168e-11 209-224 
3942 | BL00615 | C-type lectin domain proteins. | BL00615A 16.68 6.400e-11 37-55 

* Results include, in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence 
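
The footnote to Table 10 fixes the order of the four fields in each Results entry, so an entry can be split into typed values. The Python sketch below is a minimal illustration under that assumption; the names `SignatureHit` and `parse_result` are hypothetical, and the input is the SEQ ID NO: 3942 entry read as `BL00615A 16.68 6.400e-11 37-55`.

```python
from typing import NamedTuple


class SignatureHit(NamedTuple):
    """One Results entry from Table 10 (hypothetical container)."""
    subtype: str      # accession number subtype, e.g. "BL00615A"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


def parse_result(text):
    """Split a whitespace-separated Results entry into its four fields."""
    subtype, raw_score, p_value, span = text.split()
    start, end = span.split("-")
    return SignatureHit(subtype, float(raw_score), float(p_value), int(start), int(end))


if __name__ == "__main__":
    hit = parse_result("BL00615A 16.68 6.400e-11 37-55")
    print(hit.subtype, hit.p_value, hit.end - hit.start + 1)
```

The span length `end - start + 1` gives the number of residues covered by the signature, assuming the positions are inclusive.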
TABLE 11 


SEQ ID NO: | PFAM Name | Description | P-Value | PFAM Score 
3938 | Piwi | Piwi domain | 2.6e-150 | 512.7 
3940 | Ribosomal L29e | Ribosomal L29e protein family | 2.3e-19 | 77.8 
3941 | Sema | Sema domain | 4e-181 | 615.1 
3942 | lectin_c | Lectin C-type domain | 0.086 | -7.1 



TABLE 12 

SEQ ID NO: | Position of end of Signal in Amino Acid Sequence | MaxS (Maximum Score) | MeanS (Mean Score) 
3941 | 31 | 0.985 | 0.926 
3942 | 21 | 0.974 | 0.894 


TABLE 13 

SEQ ID NO: of full length nucleotide sequence | SEQ ID NO: of full length peptide sequence | SEQ ID NO: of contig nucleotide sequence | SEQ ID NO: of contig peptide sequence | Priority Docket number corresponding SEQ ID NO: in priority application | SEQ ID NO: in USSN 09/496,914 
3937 | 3943 | 3949 | 3955 | 787CIP2G_1 | 787_587 
3938 | 3944 | 3950 | 3956 | 787CIP2G_2 | 787_3813 
3939 | 3945 | 3951 | 3957 | 787CIP2G_3 | 787_4462 
3940 | 3946 | 3952 | 3958 | 787CIP2G_4 | 787_4887 
3941 | 3947 | 3953 | 3959 | 787CIP2G_5 | 787_5794 
3942 | 3948 | 3954 | 3960 | 787CIP2G_6 | 787_8743 


TABLE 14 


TISSUE ORIGIN 

LIBRARY/ 
RNA SOURCE 

HYSEQ LIBRARY 
NAME 

SEQ ID NOS: 


adult brain 

GIBCO 

ABD003 

3940 

adult brain 

Clontech 

ABR006 

3940 

adult brain 

Invitrogen 

ABR014 

3940 

cultured preadipocytes 

Stratagene 

ADP001 

3937 

adult heart 

GIBCO 

AHR001 

3940 

adult kidney 

GIBCO 

AKD001 

3940 

adult lung 

GIBCO 

ALG001 

3940 

young liver 

GIBCO 

ALV001 

3940 

adult ovary 

Invitrogen 

AOV001 

3938, 3940-3941 

adult spleen 

GIBCO 

ASP001 

3940-3941 

testis 

GIBCO 

ATS001 

3940 

bone marrow 

Clontech 

BMD001 

3938, 3940 

bone marrow 

Clontech 

BMD004 

3940 

adult cervix 

BioChain 

CVX001 

3940 

endothelial cells 

Stratagene 

EDT001 

3940 

fetal brain 

Clontech 

FBR006 

3940 

fetal brain 

Invitrogen 

FBT002 

3940-3941 

fetal heart 

Invitrogen 

FHR001 

3940 

fetal kidney 

Clontech 

FKD001 

3940 

fetal kidney 

Clontech 

FKD002 

3940 

fetal liver-spleen 

Columbia 
University 

rLoUUl 

jyj I y J^hU 

fetal liver-spleen 

Columbia 
University 

^qij? 1Q41 

fetal liver-spleen 

Columbia 
University 

ft ^nni 
rLouuj 

194ft 

fetal liver 

Clontech 

Fl V0ft4 

3940 

fetal skin 

Invitrogen 

r orvuv 1 

1940-1942 

fetal spleen 

BioChain 

r or \j\f 1 

3940 

fetal brain 

GIBCO 

HFB001 

3937, 3940-3941 

infant brain 

Columbia 
University 

TJ*9ftft9 

1Q17 1Q1Q 1Q41 

leukocyte 

194ft-1941 

leukocyte 

Clontech 

T 1 TPftftl 

1940-1941 

melanoma from cell line ATCC 

Clontech 

1940 

mammary gland 

Invitrogen 

lvliVlVJVJU 1 

1917 1940-1941 

neuronal cells 

Stratagene 

TSITT TO0 1 

3937, 3942 

prostate 

Clontech 

PTJTftftl 

WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group 
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain 
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions 
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 


12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of 
claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in the 
sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising: 

a) culturing a host cell comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature 
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active 
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, 
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions 
sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 
3955-3960, the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer-readable 
format. 

27. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 




Pages 485 to 6221 of this application contain amino acid sequence listings. 
They can be obtained at the address given below. 



World Intellectual Property Organization 
34, chemin des Colombettes 
CH-1211 Geneve 20 




